The flashcards below were created by user
on FreezingBlue Flashcards.
The purpose of statistics
the purpose of statistics is to draw conclusions about a POPULATION using data from a SAMPLE
Sample
a subset (a smaller part) of the POPULATION that we select to study
Population
the ENTIRE COLLECTION of objects about which information is desired
Statistic
a numerical summary of a sample (use statistics to draw conclusions about a parameter)
Parameter
a numerical summary of a population (use statistics to draw conclusions about a parameter)
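The difference between a numerical summary of a sample and a numerical summary of a population can be sketched in a few lines of Python. This is only an illustration: the population of heights is simulated, and the names `population`, `sample`, `parameter`, and `statistic` are made up for the example.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical POPULATION: heights (in inches) of 10,000 students (simulated)
population = [rng.gauss(67, 3) for _ in range(10_000)]

# SAMPLE: 100 students selected at random from the population
sample = rng.sample(population, 100)

parameter = mean(population)  # numerical summary of the population
statistic = mean(sample)      # numerical summary of the sample

print(f"parameter (population mean): {parameter:.2f}")
print(f"statistic (sample mean):     {statistic:.2f}")
```

The two numbers are usually close but not equal, which is why the sample summary is used to draw conclusions about the population summary rather than treated as identical to it.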
Inferential (branch of statistics)
the idea of using the sample to make conclusions about the population brings about a VERY important branch (this branch)
Variable
a characteristic whose value may change (or vary) from one individual or object to another
Data
observations on a variable
Numerical Data (quantitative)
observations are measurements (ie number of songs on an ipod, height in inches)
Categorical data (qualitative)
observations are categories (ie fav color)
Discrete data
the possible values are ISOLATED points on a number line (whole counts only, no halves)
Continuous data
the possible values form an INTERVAL on a number line (ie height of a SLCC student, time). Height is only discrete if it is ROUNDED to the nearest inch, etc. Time is only discrete if it is rounded to the nearest second, minute, etc.
TIP: when considering whether data is discrete or continuous, think about this. If a finer scale exists to measure the data at a smaller level, the data is continuous. For example, height could be measured to the nearest meter or centimeter or millimeter or...etc, making it continuous. The number of songs on an iPod cannot be measured any finer than whole songs, so it is discrete.
a statistical study is a planned intervention undertaken to observe the effects of one or more explanatory variables often called FACTORS on a response variable.
two ways to do study: manipulation, no manipulation
both observe the effects of factors on a response variable
Observational study
Occurs when the investigator simply observes characteristics of the sample
Experimental study
Occurs when an investigator observes how a specific characteristic changes when the investigator MANIPULATES one or more factors (or applies a treatment, a combination of factors)
Variables in a study
Response variable
the variable of primary interest to the researcher. It is often a recorded measurement
Explanatory variables
variables/factors that might explain how or why the response variable behaves as it does
Extraneous variables
variables that are NOT of interest to the study, but are thought to affect the response variable.
example of finding response variables
suppose we do a study measuring the amount of precipitation (in inches) in Salt Lake City and Heber City during different seasons for one year. Name the variables (response, explanatory, extraneous).
- this is an observational study (can't make it snow everyday)
- response variable: (most interested in) Precipitation
- explanatory variables: seasons, location
- extraneous: extreme weather - lake effect, humidity
Often experimental studies use BLANK to determine the effects of an applied treatment.
control groups
the control group in an experimental study serves as a BLANK treatment to compare other treatments.
baseline
which group is the control group in the following example:
Suppose Acme Chemical has devised a new anti-aging pill. In order to test it they take a sample and divide that sample into two groups. One group takes the pill twice a day for two weeks and the other group does not. Which group is the control group?
group not receiving the pill
Placebo
an innocuous medication, such as a sugar pill, that looks, tastes and smells like the experimental medication
Placebo effect
the desired effect appears in a subject simply from taking the "pill" or receiving the treatment, even though it is a fake
Many experimenters will not tell the groups whether they are the BLANK or the experimental group; this technique is known as BLANK
control group; blinding
types of sampling
- simple random
- stratified sample
- convenience sample
- systematic sample
- cluster sample
simple random sample
- this is the best one
- sample is selected such that every object has the same chance of being selected
- ie have everyone put their name into a hat, shake up the hat and then draw pieces of paper for your sample
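The names-in-a-hat idea can be sketched with Python's standard `random.sample`, which draws without replacement so that every name has the same chance of being picked. The class list `names` and the sample size of 5 are assumptions made up for the example.

```python
import random

rng = random.Random(42)  # fixed seed so the draw is repeatable

# Hypothetical population: 30 names that went into the hat
names = [f"student_{i}" for i in range(1, 31)]

# Shake the hat and draw 5 slips of paper (no name can be drawn twice)
hat_draw = rng.sample(names, 5)

print(hat_draw)
```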
Stratified sample
used when you want to draw conclusions about two or more different subgroups (strata) within a population (randomly select from inside each category)
ie: select a random sample of students from each of the following strata and ask how great their teachers were: elementary, middle, high school, college
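Randomly selecting from inside each stratum can be sketched as below. The four strata follow the example above; the stratum sizes, the student labels, and the helper `stratified_sample` are hypothetical and exist only for this illustration.

```python
import random

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical strata with made-up student labels
strata = {
    "elementary": [f"elem_{i}" for i in range(200)],
    "middle": [f"mid_{i}" for i in range(150)],
    "high school": [f"high_{i}" for i in range(180)],
    "college": [f"coll_{i}" for i in range(120)],
}

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample of the same size from every stratum."""
    return {
        name: rng.sample(members, n_per_stratum)
        for name, members in strata.items()
    }

picked = stratified_sample(strata, 10, rng)
for name, students in picked.items():
    print(name, students[:3], "...")
```

Every stratum is guaranteed to appear in the result, which is the point of stratifying: no subgroup can be missed by chance.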
Convenience sample
using an easily available or convenient group to form a sample. NEVER USE THIS ONE
Systematic sample
obtained by selecting every kth individual from the population
ie: select every 3rd name in the phone book
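Selecting every kth individual maps directly onto list slicing. In this sketch the 12-entry `phone_book` and the helper `systematic_sample` are made up for illustration; in practice the starting position is often chosen at random among the first k entries, which is left out here to keep the result easy to read.

```python
def systematic_sample(population, k, start=0):
    """Return every kth item, beginning at index `start` (0-based)."""
    return population[start::k]

# Hypothetical phone book with 12 names
phone_book = [f"name_{i}" for i in range(1, 13)]

# Every 3rd name: the 3rd, 6th, 9th, and 12th entries
every_third = systematic_sample(phone_book, 3, start=2)

print(every_third)  # ['name_3', 'name_6', 'name_9', 'name_12']
```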
Cluster sample
obtained by randomly selecting groups (clusters) and then selecting all individuals from each selected group (randomly select the clusters, then survey every person/object inside them). ie: the farm land and plant example
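Randomly selecting whole groups and then taking every individual inside them can be sketched as follows. The six farms with four plants each are invented for the example, loosely following the farm land and plant idea.

```python
import random

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Hypothetical clusters: 6 farms, each holding 4 plants
clusters = {
    f"farm_{i}": [f"farm_{i}_plant_{j}" for j in range(1, 5)]
    for i in range(1, 7)
}

# Step 1: randomly select 2 whole farms (the clusters)
chosen_farms = rng.sample(sorted(clusters), 2)

# Step 2: take EVERY plant on the chosen farms
cluster_sample = [plant for farm in chosen_farms for plant in clusters[farm]]

print(chosen_farms)
print(cluster_sample)
```

Note the contrast with a stratified sample: here randomness picks the groups and everything inside them is kept, whereas stratifying keeps every group and randomly picks inside each one.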
Bias
- the tendency for a sample to differ from the population in a systematic way
- ie: if you have a scale that is off by 2 pounds every time a person gets on it, this is a systematic difference from the true population values
NOTE: bias can result from the way in which the sample is selected or from the way in which information is obtained once the sample has been chosen.
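The scale example shows bias coming from the way information is obtained. A small sketch, using five made-up true weights, shows that the error does not average out because every reading is shifted in the same direction.

```python
from statistics import mean

# Hypothetical true weights (pounds) of five people
true_weights = [150, 162, 175, 140, 198]

# The scale reads 2 pounds heavy every single time
scale_readings = [w + 2 for w in true_weights]

# The average reading is off by the same 2 pounds: a systematic difference
bias = mean(scale_readings) - mean(true_weights)

print(bias)  # 2
```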
common types of bias
- sampling bias
- nonresponse bias
- response bias
Sampling bias
occurs when the sampling technique used tends to favor one part of the population (a part of the population is left out)
ie: only surveying students who use the gym
Nonresponse bias
occurs when individuals selected for the sample do not respond, and those who do not respond hold different opinions from those who do
ie: people aged 70-79 tend to respond more frequently to phone survey questions than those aged 20-29
Response bias
occurs when the responses do not reflect the true feelings of the respondents, usually because of a poorly worded question.
Experimental unit
a person, object, or some other well-defined item to which a treatment is applied. When it is a person, the experimental unit is often referred to as a subject
what does it mean for the experiment to be placebo-controlled?
a placebo group was used as the baseline for comparison. Subjects could not tell whether they were taking the drug or the placebo
what does it mean for the experiment to be double-blinded?
- subjects are randomly allocated to the treatment groups
- some receive the real pill, some receive the placebo
- neither the investigators nor the subjects know who received the placebo
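Random allocation with the assignment hidden from both sides can be sketched as below. Everything here is hypothetical: 20 subject IDs, a 50/50 split, and neutral `KIT-` labels standing in for identical-looking pill bottles. It illustrates the idea only and is not how any particular trial was run.

```python
import random

rng = random.Random(3)  # fixed seed so the sketch is repeatable

subject_ids = list(range(1, 21))  # hypothetical subjects 1..20

# Random allocation: shuffle the subjects, then split them in half
shuffled = subject_ids[:]
rng.shuffle(shuffled)
half = len(shuffled) // 2
allocation = {sid: "drug" for sid in shuffled[:half]}
allocation.update({sid: "placebo" for sid in shuffled[half:]})

# Blinding: subjects and investigators only ever see a neutral kit label.
# The `allocation` table stays sealed until the study ends.
kit_labels = {sid: f"KIT-{sid:03d}" for sid in subject_ids}

print(kit_labels[1])                       # KIT-001
print(sorted(allocation.values()).count("placebo"))  # 10
```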
name the population, sample, treatments, and response variable for the following study:
Lipitor is a cholesterol-lowering drug made by Pfizer. In the Collaborative Atorvastatin Diabetes Study, the effect of Lipitor on cardiovascular disease was assessed in 2838 subjects aged 40-75 with type 2 diabetes and without prior history of cardiovascular disease. In this placebo-controlled, double-blind experiment, subjects were randomly allocated to either Lipitor 10 mg daily (1428) or placebo (1410) and were followed for 4 years. The response variable was the occurrence of any major cardiovascular event.
Population: people aged 40-75 with type 2 diabetes and no prior history of cardiovascular disease
Sample: the 2838 subjects aged 40-75 with type 2 diabetes and no prior history of cardiovascular disease
Treatments: 10 mg Lipitor daily or placebo daily, for 4 years
Response variable: occurrence of any major cardiovascular event