cumulative stats 121 final
Central Limit Theorem
the sampling distribution of a statistic (x-bar, p-hat) is approximately normal whenever the sample is large and random.
an estimate of the value of a parameter in interval form with an associated level of confidence; it gives a list of plausible values for the parameter based on the value of the statistic
The percentage of all possible samples for which the confidence intervals will contain the parameter being estimated; selected subjectively by the researcher (95%, 98%)
the percent of time that the confidence interval estimation procedure gives confidence intervals that contain the value of the parameter
A chart plotting the means (x-bars) of regular samples of size n against time; it has a center line and upper and lower control limits to determine whether a process is in or out of control.
Lines on either side of the center line computed using μ-3(σ/√n) and μ+3(σ/√n)
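As a small illustration of the control-limit formula above, this Python sketch computes the center line and both limits. The function name `control_limits` and the example values (μ = 50, σ = 4, n = 16) are made up for illustration.

```python
import math

def control_limits(mu, sigma, n):
    """Return (lower, center, upper) for an x-bar chart with 3-sigma limits."""
    half_width = 3 * sigma / math.sqrt(n)  # 3(σ/√n)
    return mu - half_width, mu, mu + half_width

# Hypothetical process: mean 50, standard deviation 4, samples of size 16.
lower, center, upper = control_limits(mu=50.0, sigma=4.0, n=16)
print(lower, center, upper)  # 47.0 50.0 53.0
```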
A sample type where the researcher contacts those subjects who are readily available and does not use any random selection. Results are almost always biased.
the difference (or distance) between an observation and the mean of all the observations in a data set, or the difference between an observation and the corresponding regression model estimate.
an estimate of how many observations should be in a cell of a two way table if Ho is true (no association between row and column variables)
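The usual way to compute this estimate is (row total × column total) / grand total. A minimal Python sketch follows; the function name `expected_counts` and the 2×2 table of counts are hypothetical.

```python
def expected_counts(observed):
    """Expected cell counts under Ho (no association) for a two-way table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

# Made-up observed counts: rows are groups, columns are outcomes.
observed = [[30, 20],
            [10, 40]]
print(expected_counts(observed))  # [[20.0, 30.0], [20.0, 30.0]]
```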
the amount of total variation in the y's that is accounted for by a regression model; it is equal to ∑(yhat - ybar)²
predicting a y value for an x value that is outside the range of observed x's. dangerous and discouraged.
the distribution that models the ratio of two variance estimates; used in ANOVA to obtain the p-value for testing the equality of 3 or more means.
minimum, Q1, median, Q3, maximum; used when data are very skewed or outliers present
difference between Q3 and Q1; or the length of the box in a boxplot; contains the middle 50% of the data
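A short Python sketch of the five-number summary and the Q3 - Q1 difference described above. Quartile conventions differ between textbooks and software; this sketch assumes the default "exclusive" method of `statistics.quantiles`, so results may not match a calculator exactly. The function name and the data are made up.

```python
import statistics

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    q1, median, q3 = statistics.quantiles(data, n=4)  # default 'exclusive' method
    return min(data), q1, median, q3, max(data)

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical observations
minimum, q1, median, q3, maximum = five_number_summary(data)
print(minimum, q1, median, q3, maximum)  # 1 2.5 5.0 7.5 9
print(q3 - q1)                           # 5.0 (Q3 minus Q1)
```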
law of large numbers
the mean of observed values in a sample (x-bar) will tend to get closer and closer to μ as the sample size increases
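One way to see this is a quick simulation. The sketch below assumes a fair six-sided die (so μ = 3.5) and prints the sample mean for increasing sample sizes; the function name and the seed are arbitrary choices for illustration.

```python
import random

def mean_of_die_rolls(n, seed=0):
    """Sample mean (x-bar) of n simulated rolls of a fair die, where mu = 3.5."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return sum(rng.randint(1, 6) for _ in range(n)) / n

# x-bar should tend to land closer to 3.5 as n grows.
for n in (10, 1_000, 100_000):
    print(n, mean_of_die_rolls(n))
```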
the distribution of only one variable in a two way table (the percentages for a single row or column)
performing two or more tests of significance on the same data - INFLATES the overall α.
the actual count in a sample given in a two way table
the difference between the observed value of the statistic and the hypothesized value of the corresponding parameter (xbar - μo)
what makes process Out of Control
one sample mean outside the control limits, or nine sample means in a row above or below the center line in a control chart.
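The two signals above can be checked mechanically. The following Python sketch is one possible way to do it; the function name `out_of_control`, its parameters, and the sample means are hypothetical, and a mean falling exactly on the center line is assumed to break a run.

```python
def out_of_control(means, lower, center, upper, run_length=9):
    """True if any mean is outside the limits, or if run_length
    consecutive means fall on the same side of the center line."""
    if any(m < lower or m > upper for m in means):
        return True
    run, prev_side = 0, 0
    for m in means:
        side = (m > center) - (m < center)  # +1 above, -1 below, 0 on the line
        if side != 0 and side == prev_side:
            run += 1
        else:
            run = 1 if side != 0 else 0
        prev_side = side
        if run >= run_length:
            return True
    return False

# Made-up sample means for a chart with limits 47 and 53, center line 50.
print(out_of_control([50.5] * 9, 47, 50, 53))          # True (nine in a row above)
print(out_of_control([50.5, 49.5] * 5, 47, 50, 53))    # False
print(out_of_control([50.1, 53.4, 49.9], 47, 50, 53))  # True (one outside limits)
```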
a characteristic (mean, median, proportion) of the population
1-β; probability of making a correct decision by rejecting a false null hypothesis;
increases when α increases, or when n increases
when the difference between the observed statistic and claimed parameter value is large enough to be worth reporting (only assess if results are statistically significant)
an interval estimate of plausible values for a single observation of Y at a specified value of X
the percentage of total variation in y that is explained by x
the difference between the actual y and the predicted y
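To make the last two definitions concrete, here is a minimal Python sketch that fits a least-squares line, then computes each residual (actual y minus predicted y) and the fraction of total variation in y explained by x. The function names and the four data points are made up for illustration.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for a simple linear regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def residuals_and_r_squared(xs, ys):
    """Return (list of residuals, r-squared) for the fitted line."""
    intercept, slope = fit_line(xs, ys)
    y_bar = sum(ys) / len(ys)
    predicted = [intercept + slope * x for x in xs]
    residuals = [y - y_hat for y, y_hat in zip(ys, predicted)]
    ss_model = sum((y_hat - y_bar) ** 2 for y_hat in predicted)  # explained variation
    ss_total = sum((y - y_bar) ** 2 for y in ys)                 # total variation
    return residuals, ss_model / ss_total

xs = [1, 2, 3, 4]  # hypothetical x values
ys = [2, 4, 5, 8]  # hypothetical y values
residuals, r_squared = residuals_and_r_squared(xs, ys)
print([round(e, 2) for e in residuals])
print(round(r_squared, 3))
```

Multiplying the second printed value by 100 gives the percentage form used in the definition.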
Sampling Distribution of X-bar
a distribution of the sample mean; a list of all the possible values for x-bar together with the frequency of each value
Sampling Distribution of P-hat
a distribution of the sample proportion; a list of all the possible values for p-hat together with the frequency of each value
α; probability of making a Type I error (rejecting a true null hypothesis)
Standard Deviation of P-hat
variability of the sampling distribution of p-hat; √(p(1-p)/n)
Standard Deviation of X-bar
variability of the sampling distribution of x-bar; σ/√n
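Both standard deviation formulas can be sketched in a few lines of Python. The function names and the example values (p = 0.5 with n = 100, and σ = 10 with n = 25) are assumptions chosen only to make the arithmetic easy to follow.

```python
import math

def sd_p_hat(p, n):
    """Standard deviation of p-hat: sqrt(p(1-p)/n)."""
    return math.sqrt(p * (1 - p) / n)

def sd_x_bar(sigma, n):
    """Standard deviation of x-bar: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

print(f"{sd_p_hat(0.5, 100):.3f}")  # 0.050
print(f"{sd_x_bar(10, 25):.3f}")    # 2.000
```

Note that n sits under a square root in both formulas, so quadrupling the sample size only halves the variability.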
population is divided into strata based on a characteristic and an SRS is taken from each stratum
t-test (when needed?)
test of significance, used when σ is unknown
Type I error
when a true null hypothesis is rejected (believing Ha is true, when Ho is true)
Type II error
when a false null hypothesis is not rejected (believing Ho, when Ha is true)
the number of standard deviations a value or observation is from the mean
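In formula form this is (value - mean) / standard deviation. A tiny Python sketch, with a hypothetical function name and made-up numbers:

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / sd

# Hypothetical exam: mean 70, standard deviation 10, observed score 85.
print(z_score(85, 70, 10))  # 1.5
```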
What is matched pairs?
data where 2 measurements are taken at different times (or under different conditions) on each individual in a sample (one sample, two treatments)
the probability of getting a value of the test statistic as extreme or more extreme than the value actually observed, assuming Ho is true.
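As a rough illustration, the sketch below computes a two-sided p-value for a z test statistic using the standard normal distribution. This assumes a z test (σ known) and a two-sided Ha; a t-test would use the t distribution instead. The function name is hypothetical.

```python
from statistics import NormalDist

def two_sided_p_value(z):
    """P(|Z| >= |z|) for a standard normal Z, assuming Ho is true."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

# A test statistic of 1.96 gives a p-value of roughly 0.05.
print(round(two_sided_p_value(1.96), 3))  # 0.05
```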
What is the probability that the null hypothesis is true?
1 or 0. It either is or it isn't.
Margin of Error
the maximum amount that a statistic will differ from the value of the parameter it estimates for the middle --% (90, 95, 98) of statistics.
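For a sample proportion, a common large-sample version of this is z* × √(p-hat(1-p-hat)/n), where z* is the critical value for the chosen confidence level. The Python sketch below assumes that formula; the function name and the example values are made up.

```python
import math
from statistics import NormalDist

def margin_of_error_for_proportion(p_hat, n, confidence=0.95):
    """Large-sample margin of error for p-hat at the given confidence level."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    return z_star * math.sqrt(p_hat * (1 - p_hat) / n)

# Hypothetical poll: p-hat = 0.5 from a sample of 100.
moe = margin_of_error_for_proportion(0.5, 100)
print(round(moe, 3))                            # 0.098
print(round(0.5 - moe, 3), round(0.5 + moe, 3))  # interval endpoints: 0.402 0.598
```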
Chi-squared: what is the minimum size each expected count in each cell needs to be?
chi-squared: what are the degrees of freedom?
(r-1)(c-1) where r=number of rows, and c=number of columns
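The degrees-of-freedom rule above is easy to check in code. In this Python sketch the table is a list of rows, totals are assumed to be excluded, and the function name is hypothetical.

```python
def chi_squared_df(table):
    """Degrees of freedom (r-1)(c-1) for a two-way table given as a list of rows."""
    rows = len(table)
    cols = len(table[0])
    return (rows - 1) * (cols - 1)

# Made-up 2x3 table of counts (no totals row or column).
print(chi_squared_df([[12, 8, 10],
                      [9, 11, 10]]))  # 2
```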
Chi-squared: what is Ho?
There is no association between the row and column variables.
Chi-squared: what is Ha?
There is an association between the row and column variables.