Stats final

The flashcards below were created by user TTan777 on FreezingBlue Flashcards.

1. Central Limit Theorem
the sampling distribution of a statistic (x-bar, p-hat) is approximately normal whenever the sample is large and random.
2. Confidence Interval
an estimate of the value of a parameter in interval form with an associated level of confidence; it gives a list of plausible values for the parameter based on the value of the statistic
3. Confidence Level
The percentage of all possible samples for which the confidence intervals will contain the parameter being estimated; selected subjectively by the researcher (95%, 98%)

the percent of time that the confidence interval estimation procedure gives confidence intervals that contain the value of the parameter
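A minimal simulation sketch of what the confidence level describes, assuming a normal population with known σ and a z-interval. The population values, sample size, seed, and the name `coverage_rate` are illustrative assumptions, not part of the card set. Roughly 95% of the simulated intervals should contain μ.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def coverage_rate(mu, sigma, n, level, reps, seed=0):
    """Fraction of simulated z-intervals that contain the true mean mu."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z_star * sigma / sqrt(n)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = fmean(sample)
        # Count the interval as a hit when it captures the parameter.
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / reps

# Hypothetical population: mu = 100, sigma = 15.
rate = coverage_rate(mu=100, sigma=15, n=30, level=0.95, reps=2000)
print(rate)
```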
4. Control Chart
A chart plotting the means (x-bars) of regular samples of size n against time; it has a center line and upper and lower control limits to determine whether a process is in or out of control.
5. Control Limits
Lines on either side of the center line computed using μ-3(σ/√n) and μ+3(σ/√n)
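A short sketch of the control limit formulas, μ-3(σ/√n) and μ+3(σ/√n). The function name and the example values (μ = 50, σ = 4, n = 16) are made up for illustration.

```python
import math

def control_limits(mu, sigma, n):
    """Return (lower, center, upper) for an x-bar control chart."""
    half_width = 3 * sigma / math.sqrt(n)
    return mu - half_width, mu, mu + half_width

# Hypothetical process: mu = 50, sigma = 4, samples of size 16.
lower, center, upper = control_limits(mu=50.0, sigma=4.0, n=16)
print(lower, center, upper)
```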
6. Convenience Sample
A sample type where the researcher contacts those subjects who are readily available and does not use any random selection.  Results are almost always biased.
7. Deviation
the difference (or distance) between an observation and the mean of all the observations in a data set, or the difference between an observation and the corresponding regression model estimate.
8. Expected Count
an estimate of how many observations should be in a cell of a two way table if Ho is true (no association between row and column variables)
9. Explained variation
the amount of total variation in the y's that is accounted for by a regression model; it is equal to ∑(yhat - ybar)²
10. extrapolation
predicting a y value for an x value that is outside the range of observed x's. dangerous and discouraged.
11. F-distribution
the distribution that models the ratio of two variance estimates; used in ANOVA for obtaining the p-value for testing equality of 3 or more means.
12. Five-Number summary
minimum, Q1, median, Q3, maximum; used when data are very skewed or outliers present
13. interquartile range
difference between Q3 and Q1, or the length of the box in a boxplot; spans the middle 50% of the data
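A sketch of the five-number summary and the interquartile range. Quartile conventions differ between textbooks and software. This version assumes the "median of each half" rule (excluding the overall median when n is odd), which may not match the rule used in the course. The function name and data are illustrative.

```python
from statistics import median

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower_half = values[:half]
    # Skip the middle value when n is odd so the halves do not share it.
    upper_half = values[half + 1:] if n % 2 else values[half:]
    return (values[0], median(lower_half), median(values),
            median(upper_half), values[-1])

summary = five_number_summary([7, 1, 9, 3, 5, 2, 8, 4, 6])
iqr = summary[3] - summary[1]  # Q3 - Q1
print(summary, iqr)
```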
14. law of large numbers
the mean of observed values in a sample (x-bar) will tend to get closer and closer to μ as the sample size increases
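A small simulation sketch of the law of large numbers, assuming rolls of a fair six-sided die (so μ = 3.5). The seed, sample sizes, and function name are arbitrary choices for illustration. The error tends to shrink as n grows, although it need not shrink at every single step.

```python
import random
from statistics import fmean

def mean_errors(sample_sizes, seed=0):
    """Map each sample size n to |x-bar - mu| for n simulated die rolls."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    errors = {}
    for n in sample_sizes:
        rolls = [rng.randint(1, 6) for _ in range(n)]
        errors[n] = abs(fmean(rolls) - 3.5)  # mu = 3.5 for a fair die
    return errors

errors = mean_errors([10, 1000, 100000])
for n, err in errors.items():
    print(n, round(err, 4))
```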
15. marginal distribution
the distribution of only one variable in a two way table (the percentages for a single row or column)
16. Multiple analyses
performing two or more tests of significance on the same data - INFLATES the overall α.
17. Observed Count
the actual count in a sample given in a two way table
18. observed effect
the difference between the observed value of the statistic and the hypothesized value of the corresponding parameter (xbar - μo)
19. what makes process Out of Control
one sample mean outside the control limits, or nine sample means in a row on the same side of the center line (all above or all below) in a control chart.
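A sketch of how the two out-of-control signals could be checked for a list of sample means. The function name, the `run_length` parameter, and the example limits are illustrative assumptions. A mean exactly on the center line is treated here as breaking a run, which is one possible convention.

```python
def out_of_control(means, lower, center, upper, run_length=9):
    """Return True if either signal from the card is present."""
    # Signal 1: any single sample mean outside the control limits.
    if any(m < lower or m > upper for m in means):
        return True
    # Signal 2: run_length consecutive means on the same side of center.
    run, side = 0, 0
    for m in means:
        current = (m > center) - (m < center)  # +1 above, -1 below, 0 on line
        if current != 0 and current == side:
            run += 1
        else:
            run = 1 if current != 0 else 0
            side = current
        if run >= run_length:
            return True
    return False

# Hypothetical limits: lower = 47, center = 50, upper = 53.
print(out_of_control([51] * 9, 47, 50, 53))        # nine in a row above center
print(out_of_control([51, 49] * 5, 47, 50, 53))    # alternating sides
```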
20. parameter
a characteristic (mean, median, proportion) of the population
21. Power
1-β; probability of making a correct decision by rejecting a false null hypothesis;

increases when α increases, or when n increases
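A sketch showing power rising with α and with n. It assumes one specific setting that the card does not spell out: a one-sided z-test (Ha: μ > μo) with known σ. The function name and all numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def power_one_sided_z(mu0, mu_true, sigma, n, alpha):
    """Power (1 - beta) of a one-sided z-test of Ho: mu = mu0 vs Ha: mu > mu0."""
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha)                 # reject Ho when z > critical
    shift = (mu_true - mu0) / (sigma / sqrt(n))     # true effect in standard errors
    return 1 - z.cdf(critical - shift)

base = power_one_sided_z(mu0=100, mu_true=103, sigma=10, n=25, alpha=0.05)
bigger_alpha = power_one_sided_z(mu0=100, mu_true=103, sigma=10, n=25, alpha=0.10)
bigger_n = power_one_sided_z(mu0=100, mu_true=103, sigma=10, n=100, alpha=0.05)
print(round(base, 3), round(bigger_alpha, 3), round(bigger_n, 3))
```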
22. practical significance
when the difference between the observed statistic and claimed parameter value is large enough to be worth reporting (only assess if results are statistically significant)
23. Prediction Interval
an interval estimate of plausible values for a single observation of Y at a specified value of X
24. r-squared
the percentage of total variation in y that is explained by x
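A sketch of r-squared as explained variation divided by total variation, assuming a simple least-squares line fitted by hand. The data and the function name are made up for illustration.

```python
from statistics import fmean

def r_squared(xs, ys):
    """Return (explained variation, total variation, r-squared) for a least-squares line."""
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    y_hats = [intercept + slope * x for x in xs]
    explained = sum((y_hat - y_bar) ** 2 for y_hat in y_hats)  # sum of (yhat - ybar)^2
    total = sum((y - y_bar) ** 2 for y in ys)                  # sum of (y - ybar)^2
    return explained, total, explained / total

explained, total, r2 = r_squared([1, 2, 3, 4], [2, 4, 5, 8])
print(explained, total, r2)
```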
25. Residual
the difference between the actual y and the predicted y
26. Sampling Distribution of X-bar
a distribution of the sample mean; a list of all the possible values for x-bar together with the frequency of each value
27. Sampling Distribution of P-hat
a distribution of the sample proportion; a list of all the possible values for p-hat together with the frequency of each value
28. Significance Level
α; probability of making a Type I error (rejecting a true null hypothesis)
29. Standard Deviation of P-hat
Variability of samp. dist. of p-hat; √(p(1-p)/n)
30. Standard Deviation of X-bar
variability of samp. dist. of x-bar; σ/√n
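A short sketch of the two standard deviation formulas, √(p(1-p)/n) for p-hat and σ/√n for x-bar. Function names and example values are illustrative.

```python
from math import sqrt

def sd_p_hat(p, n):
    """Standard deviation of the sampling distribution of p-hat."""
    return sqrt(p * (1 - p) / n)

def sd_x_bar(sigma, n):
    """Standard deviation of the sampling distribution of x-bar."""
    return sigma / sqrt(n)

# Hypothetical values: p = 0.5 with n = 100; sigma = 10 with n = 25.
print(sd_p_hat(0.5, 100), sd_x_bar(10, 25))
```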
31. Stratified Sample
the population is divided into strata based on a characteristic and an SRS is taken from each stratum
32. t-test (when needed?)
test of significance, used when σ is unknown
33. Type I error
when a true null hypothesis is rejected (believing Ha is true, when Ho is true)
34. Type II error
when a false null hypothesis is not rejected (believing Ho, when Ha is true)
35. z-score
the number of standard deviations a value or observation is from the mean
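A one-function sketch of the z-score, (value - mean) / standard deviation. The example numbers are hypothetical.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / sd

# Hypothetical exam: mean 70, standard deviation 10.
print(z_score(85, mean=70, sd=10))  # above the mean, so positive
print(z_score(60, mean=70, sd=10))  # below the mean, so negative
```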
36. What is matched pairs?
data where 2 measurements are taken at different times (or under different conditions) on each individual in a sample (one sample, two treatments)
37. P-value
the probability of getting a value of the test statistic as extreme or more extreme than the value actually observed, assuming Ho is true.
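A sketch of a P-value calculation, assuming the test statistic is a z statistic that follows the standard normal distribution when Ho is true. The function name and the `alternative` labels are illustrative choices.

```python
from statistics import NormalDist

def z_test_p_value(z, alternative="two-sided"):
    """P-value for an observed z statistic under Ho."""
    std_normal = NormalDist()
    if alternative == "greater":   # Ha: parameter > hypothesized value
        return 1 - std_normal.cdf(z)
    if alternative == "less":      # Ha: parameter < hypothesized value
        return std_normal.cdf(z)
    # Two-sided: values at least as extreme in either tail.
    return 2 * (1 - std_normal.cdf(abs(z)))

print(round(z_test_p_value(1.96), 4))
print(round(z_test_p_value(1.96, "greater"), 4))
```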
38. What is the probability that the null hypothesis is true?
1 or 0. It either is true or it isn't.
39. Margin of Error
the maximum amount by which a statistic will differ from the value of the parameter it estimates, for the middle ___% (e.g., 90, 95, 98) of statistics.
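A sketch of a margin of error for a mean, assuming a z-interval with known σ, so that the margin is z* times σ/√n. The card does not state this formula, and the function name and numbers are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(sigma, n, level):
    """z* times sigma / sqrt(n) for the given confidence level."""
    # z* cuts off the middle `level` proportion of the standard normal curve.
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return z_star * sigma / sqrt(n)

# Hypothetical values: sigma = 10, n = 100, 95% confidence.
moe = margin_of_error(sigma=10, n=100, level=0.95)
print(round(moe, 3))
```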
40. Chi-squared: what is the minimum required expected count in each cell?
5
41. chi-squared: what are the degrees of freedom?
(r-1)(c-1) where r=number of rows, and c=number of columns
42. Chi-squared: what is Ho?
there is no association between the row and column variables
43. Chi-squared: what is Ha?
There is an association between the row and column variables.
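A sketch tying the chi-squared cards together for a two way table. It uses the standard formulas: expected count = (row total × column total) / grand total, and statistic = ∑(observed - expected)²/expected. The function name and the table of counts are made up for illustration, and no p-value is computed.

```python
def chi_squared_summary(observed):
    """Return (expected counts, chi-squared statistic, df, counts_ok)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    # Expected count in each cell if Ho (no association) is true.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(observed) - 1) * (len(observed[0]) - 1)  # (r-1)(c-1)
    counts_ok = all(e >= 5 for row in expected for e in row)
    return expected, statistic, df, counts_ok

# Hypothetical 2x2 table of observed counts.
expected, statistic, df, counts_ok = chi_squared_summary([[20, 30], [30, 20]])
print(expected, statistic, df, counts_ok)
```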
Author: TTan777 ID: 212729 Card Set: Stats final Updated: 2013-04-17 17:56:36 Tags: Statistics Description: cumulative stats 121 final