Stats final

Card Set Information

2013-04-17 13:56:36

cumulative stats 121 final

  1. Central Limit Theorem
    the sampling distribution of a statistic (x-bar, p-hat) is approximately normal whenever the sample is large and random.
  2. Confidence Interval
    an estimate of the value of a parameter in interval form with an associated level of confidence; it gives a list of plausible values for the parameter based on the value of the statistic
  3. Confidence Level
    The percentage of all possible samples for which the confidence intervals will contain the parameter being estimated; selected subjectively by the researcher (95%, 98%)

    the percent of time that the confidence interval estimation procedure gives confidence intervals that contain the value of the parameter
  4. Control Chart
    a chart plotting the means (x-bars) of regular samples of size n against time. It has a center line and upper and lower control limits, which are used to determine whether a process is in or out of control.
  5. Control Limits
    Lines on either side of the center line computed using μ-3(σ/√n) and μ+3(σ/√n)
  6. Convenience Sample
    A sample type where the researcher contacts those subjects who are readily available and does not use any random selection.  Results are almost always biased.
  7. Deviation
    the difference (or distance) between an observation and the mean of all the observations in a data set, or the difference between an observation and the corresponding regression model estimate.
  8. Expected Count
    an estimate of how many observations should be in a cell of a two way table if Ho is true (no association between row and column variables)
  9. Explained variation
    the amount of total variation in the y's that is accounted for by a regression model; it is equal to ∑(yhat - ybar)²
  10. extrapolation
    predicting a y value for an x value that is outside the range of observed x's. dangerous and discouraged.
  11. F-distribution
    the distribution that models the ratio of two variance estimates; used in ANOVA for obtaining the p-value when testing equality of 3 or more means.
  12. Five-Number summary
    minimum, Q1, median, Q3, maximum; used when data are very skewed or outliers present
  13. interquartile range
    difference between Q3 and Q1; or the length of the box in a boxplot; contains the middle 50% of the data
  14. law of large numbers
    the mean of observed values in a sample (x-bar) will tend to get closer and closer to μ as the sample size increases
  15. marginal distribution
    the distribution of only one variable in a two way table (the percentages for a single row or column)
  16. Multiple analyses
    performing two or more tests of significance on the same data - INFLATES the overall α.
  17. Observed Count
    the actual count in a sample given in a two way table
  18. observed effect
    the difference between the observed value of the statistic and the hypothesized value of the corresponding parameter (xbar - μo)
  19. what makes process Out of Control
    one sample mean outside the control limits, or nine sample means in a row on the same side of the center line (all above or all below) in a control chart.
  20. parameter
    a characteristic (mean, median, proportion) of the population
  21. Power
    1-β; probability of making a correct decision by rejecting a false null hypothesis;

    increases when α increases, or when n increases
  22. practical significance
    when the difference between the observed statistic and claimed parameter value is large enough to be worth reporting (only assess if results are statistically significant)
  23. Prediction Interval
    an interval estimate of plausible values for a single observation of Y at a specified value of X
  24. r-squared
    the percentage of total variation in y that is explained by x
  25. Residual
    the difference between the actual y and the predicted y
  26. Sampling Distribution of X-bar
    a distribution of the sample mean; a list of all the possible values for x-bar together with the frequency of each value
  27. Sampling Distribution of P-hat
    a distribution of the sample proportion; a list of all the possible values for p-hat together with the frequency of each value
  28. Significance Level
    α; probability of making a Type I error (rejecting a true null hypothesis)
  29. Standard Deviation of P-hat
    variability of the sampling distribution of p-hat; √(p(1-p)/n)
  30. Standard Deviation of X-bar
    variability of the sampling distribution of x-bar; σ/√n
  31. Stratified Sample
    the population is divided into strata based on a characteristic, and a simple random sample (SRS) is taken from each stratum
  32. t-test (when needed?)
    test of significance, used when σ is unknown
  33. Type I error
    when a true null hypothesis is rejected (believing Ha is true, when Ho is true)
  34. Type II error
    when a false null hypothesis is not rejected (believing Ho, when Ha is true)
  35. z-score
    the number of standard deviations a value or observation is from the mean
  36. What is matched pairs?
    data where 2 measurements are taken at different times (or under different conditions) on each individual in a sample (one sample, two treatments)
  37. P-value
    the probability of getting a value of the test statistic as extreme or more extreme than the value actually observed, assuming Ho is true.
  38. What is the probability that the null hypothesis is true?
    1 or 0. It either is true or it isn't.
  39. Margin of Error
    the maximum amount by which a statistic will differ from the value of the parameter it estimates for the middle __% of all possible statistics, where the blank is the confidence level (90, 95, 98).
  40. Chi-squared: what is the minimum size each expected count must be in every cell?
  41. chi-squared: what are the degrees of freedom?
    (r-1)(c-1) where r=number of rows, and c=number of columns
  42. Chi-squared: what is Ho?
    there is no association between the row and column variables
  43. Chi-squared: what is Ha?
    there is an association between the row and column variables
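
Worked example for the chi-squared cards (8, 17, 41): a minimal Python sketch of how expected counts and degrees of freedom could be computed from a two way table of observed counts. The cards do not give the formulas for the expected count or the test statistic; the sketch assumes the standard ones, (row total × column total) / grand total and ∑(observed - expected)² / expected. The function names and the observed counts are made up for illustration.

```python
def expected_counts(observed):
    """Expected count per cell if Ho is true: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def degrees_of_freedom(observed):
    """(r-1)(c-1), where r = number of rows and c = number of columns."""
    return (len(observed) - 1) * (len(observed[0]) - 1)


def chi_squared_statistic(observed):
    """Sum over all cells of (observed - expected)^2 / expected."""
    expected = expected_counts(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Hypothetical 2x2 table of observed counts (not from the card set).
observed = [[30, 20],
            [20, 30]]

print(expected_counts(observed))        # every cell has expected count 25.0
print(degrees_of_freedom(observed))     # (2-1)(2-1) = 1
print(chi_squared_statistic(observed))  # 4 cells * (5^2 / 25) = 4.0
```

With every row and column total equal to 50 out of 100, each expected count is 25, so each cell contributes (5)² / 25 = 1 to the statistic. The p-value would then come from the chi-squared distribution with 1 degree of freedom, which this sketch does not compute.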