Central Limit Theorem
the sampling distribution of a statistic (xbar, phat) is approximately normal whenever the sample is large and random.

Confidence Interval
an estimate of the value of a parameter in interval form with an associated level of confidence; it gives a list of plausible values for the parameter based on the value of the statistic

Confidence Level
The percentage of all possible samples for which the confidence intervals will contain the parameter being estimated; selected subjectively by the researcher (95%, 98%)
the percent of time that the confidence interval estimation procedure gives confidence intervals that contain the value of the parameter

Control Chart
A chart plotting the means (xbars) of regular samples of size n against time. It has a center line and upper and lower control limits, used to determine whether a process is in or out of control.

Control Limits
Lines on either side of the center line, computed using μ - 3(σ/√n) and μ + 3(σ/√n)
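
As a rough illustration of the control-limit formulas, here is a minimal Python sketch. The function name `control_limits` and the example values (μ = 100, σ = 4, n = 16) are made up for illustration and are not from the original cards.

```python
import math

def control_limits(mu, sigma, n):
    """Return (lower, upper) control limits for sample means of size n."""
    # Half-width of the band around the center line: 3 * (sigma / sqrt(n))
    half_width = 3 * sigma / math.sqrt(n)
    return mu - half_width, mu + half_width

# Hypothetical process: mean 100, standard deviation 4, samples of size 16
lower, upper = control_limits(mu=100.0, sigma=4.0, n=16)
print(lower, upper)  # 97.0 103.0
```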

Convenience Sample
A sample type where the researcher contacts those subjects who are readily available and does not use any random selection. Results are almost always biased.

Deviation
the difference (or distance) between an observation and the mean of all the observations in a data set, or the difference between an observation and the corresponding regression model estimate.

Expected Count
an estimate of how many observations should be in a cell of a two-way table if Ho is true (no association between row and column variables)
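
The card does not give the formula, but under no association the usual calculation is (row total × column total) / table total. A minimal Python sketch, assuming the two-way table is stored as a list of rows; the name `expected_counts` and the sample table are hypothetical.

```python
def expected_counts(observed):
    """Expected count for each cell: row total * column total / table total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]

# Hypothetical 2x2 table of observed counts
observed = [[10, 20], [30, 40]]
expected = expected_counts(observed)
print(expected)  # [[12.0, 18.0], [28.0, 42.0]]
```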

Explained variation
the amount of total variation in the y's that is accounted for by a regression model; it is equal to ∑(yhat - ybar)²

extrapolation
predicting a y value for an x value that is outside the range of observed x's. dangerous and discouraged.

F-distribution
the distribution that models the ratio of two variance estimates; used in ANOVA for obtaining the p-value for testing equality of 3 or more means.

Five-Number summary
minimum, Q1, median, Q3, maximum; used when data are very skewed or outliers present

interquartile range
difference between Q3 and Q1 (Q3 - Q1); the length of the box in a boxplot; contains the middle 50% of the data

law of large numbers
the mean of observed values in a sample (xbar) will tend to get closer and closer to μ as the sample size increases

marginal distribution
the distribution of only one variable in a two-way table (the percentages for a single row or column)

Multiple analyses
performing two or more tests of significance on the same data; this INFLATES the overall α.

Observed Count
the actual count in a sample given in a two-way table

observed effect
the difference between the observed value of the statistic and the hypothesized value of the corresponding parameter (xbar - μo)

What makes a process out of control?
one sample mean outside the control limits, or nine sample means in a row on the same side of the center line (all above or all below) in a control chart.
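
The two signals above can be sketched as a simple check in Python. This is only an illustration: the function name `is_out_of_control`, its parameters, and the example values are hypothetical, and a mean exactly on the center line is assumed to break a run.

```python
def is_out_of_control(means, center, lower, upper, run_length=9):
    """Flag a process using the two signals on the card."""
    run = 0        # length of the current run on one side of the center line
    last_side = 0  # +1 above center, -1 below, 0 exactly on the line
    for m in means:
        # Signal 1: a single sample mean outside the control limits
        if m < lower or m > upper:
            return True
        side = (m > center) - (m < center)
        if side != 0 and side == last_side:
            run += 1
        elif side != 0:
            run = 1
        else:
            run = 0  # assumption: a mean on the center line resets the run
        last_side = side
        # Signal 2: run_length means in a row on the same side
        if run >= run_length:
            return True
    return False

# Hypothetical chart with center 100 and limits 97 and 103
print(is_out_of_control([101.0] * 9, 100.0, 97.0, 103.0))  # True
```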

parameter
a characteristic (mean, median, proportion) of the population

Power
1 - β; probability of making a correct decision by rejecting a false null hypothesis;
increases when α increases, or when n increases

practical significance
when the difference between the observed statistic and claimed parameter value is large enough to be worth reporting (only assess if results are statistically significant)

Prediction Interval
an interval estimate of plausible values for a single observation of Y at a specified value of X

r-squared
the percentage of total variation in y that is explained by x
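
To make the idea concrete, r-squared can be computed as explained variation ∑(yhat - ybar)² divided by total variation ∑(y - ybar)². The Python sketch below assumes a least-squares line as the regression model; the function name `r_squared` and the three data points are hypothetical.

```python
def r_squared(xs, ys):
    """Return (explained variation, total variation, r-squared) for a fitted line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Least-squares slope and intercept (assumed regression model)
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    y_hats = [intercept + slope * x for x in xs]
    explained = sum((y_hat - y_bar) ** 2 for y_hat in y_hats)
    total = sum((y - y_bar) ** 2 for y in ys)
    return explained, total, explained / total

# Hypothetical data set
explained, total, r2 = r_squared([1, 2, 3], [1, 3, 2])
print(explained, total, r2)  # 0.5 2.0 0.25
```

Here 25% of the variation in y is explained by x.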

Residual
the difference between the actual y and the predicted y

Sampling Distribution of Xbar
a distribution of the sample mean; a list of all the possible values for xbar together with the frequency of each value

Sampling Distribution of Phat
a distribution of the sample proportion; a list of all the possible values for phat together with the frequency of each value

Significance Level
α; probability of making a Type I error (rejecting a true null hypothesis)

Standard Deviation of Phat
variability of the sampling distribution of phat; √(p(1 - p)/n)

Standard Deviation of Xbar
variability of the sampling distribution of xbar; σ/√n
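
A small Python sketch of the two standard deviation formulas, √(p(1 - p)/n) for phat and σ/√n for xbar. The function names `sd_phat` and `sd_xbar` and the example inputs are hypothetical.

```python
import math

def sd_phat(p, n):
    """Standard deviation of the sampling distribution of phat."""
    return math.sqrt(p * (1 - p) / n)

def sd_xbar(sigma, n):
    """Standard deviation of the sampling distribution of xbar."""
    return sigma / math.sqrt(n)

# Hypothetical inputs: p = 0.5 with n = 100, and sigma = 10 with n = 25
print(sd_phat(0.5, 100))
print(sd_xbar(10.0, 25))
```

Both shrink as n grows, which is why larger samples give less variable statistics.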

Stratified Sample
the population is divided into strata based on a characteristic, and an SRS is taken from each stratum

t-test (when needed?)
test of significance, used when σ is unknown

Type I error
when a true null hypothesis is rejected (believing Ha is true, when Ho is true)

Type II error
when a false null hypothesis is not rejected (believing Ho, when Ha is true)

z-score
the number of standard deviations a value or observation is from the mean
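
The definition translates directly into (value - mean) / standard deviation. A minimal Python sketch; the function name `z_score` and the numbers are hypothetical.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / sd

# Hypothetical observation of 85 from a distribution with mean 70 and sd 10
print(z_score(85.0, 70.0, 10.0))  # 1.5
```

A negative result means the observation is below the mean.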

What is matched pairs?
data where 2 measurements are taken at different times (or under different conditions) on each individual in a sample (one sample, two treatments)

P-value
the probability of getting a value of the test statistic as extreme or more extreme than the value actually observed, assuming Ho is true.

What is the probability that the null hypothesis is true?
1 or 0. It is or it isn't.

Margin of Error
the maximum amount by which a statistic will differ from the value of the parameter it estimates, for the middle 90%, 95%, or 98% of statistics (matching the confidence level).

Chi-squared: what is the minimum size each expected count needs to be?
5

Chi-squared: what are the degrees of freedom?
(r - 1)(c - 1) where r = number of rows, and c = number of columns
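
The degrees of freedom (r - 1)(c - 1) and the expected-count condition (each cell 5 or larger) can be sketched in Python as follows. The function names and the sample tables are hypothetical, and the table is assumed to be a list of rows.

```python
def chi_square_df(table):
    """Degrees of freedom for a two-way table: (rows - 1) * (columns - 1)."""
    rows = len(table)
    cols = len(table[0])
    return (rows - 1) * (cols - 1)

def expected_counts_large_enough(expected, minimum=5):
    """True if every expected count is at least the minimum (5 on the card)."""
    return all(count >= minimum for row in expected for count in row)

# Hypothetical 2x3 table of observed counts
print(chi_square_df([[12, 8, 10], [9, 11, 10]]))  # 2

# Hypothetical table of expected counts
print(expected_counts_large_enough([[12.0, 18.0], [28.0, 42.0]]))  # True
```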

Chi-squared: what is Ho?
There is no association between the row and column variables.

Chi-squared: what is Ha?
There is an association between the row and column variables.

