Statistics Ch 13

Card Set Information

2012-02-25 13:12:43

Introduction to Inference

  1. Introduction to Inference
    • Sample means from random samples tend to fall close to the population parameter μ.
    • About 95% of all sample means fall within roughly 2 standard deviations (2*σ/√n) of the population parameter μ.
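The 95% claim can be checked with a small simulation. The sketch below assumes a Normal population with known σ; the values of mu, sigma, n, and the number of trials are illustrative and not taken from the card set.

```python
import random
from math import sqrt

def fraction_within_two_sd(mu, sigma, n, trials, seed=0):
    """Fraction of simulated sample means within 2*sigma/sqrt(n) of mu."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    half_width = 2 * sigma / sqrt(n)   # two standard deviations of the sample mean
    hits = 0
    for _ in range(trials):
        sample_mean = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if abs(sample_mean - mu) <= half_width:
            hits += 1
    return hits / trials

# Hypothetical population: mean 50, standard deviation 10, samples of size 25.
fraction = fraction_within_two_sd(mu=50.0, sigma=10.0, n=25, trials=2000)
print(f"fraction of sample means within 2 sd of mu: {fraction:.3f}")
```

The printed fraction should land near 0.95, varying slightly with the seed and the number of trials.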
  2. A level C confidence interval for a parameter has two parts:
    • (1) An interval calculated from the data, usually of the form estimate ± margin of error.
    • (2) A confidence level C, which gives the probability that the interval will capture the true parameter value in repeated samples, or the success rate for the method.
  3. confidence interval
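    for a population mean can be expressed from a sample mean x̄ as: x̄ ± z*·σ/√n, where σ/√n is the standard deviation of the sample mean and z* is the critical value for the chosen confidence level.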

  4. confidence level C
    (in %) indicates the success rate of the method that produces the interval. It represents the area under the normal curve within ± m of the center of the curve.
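As a rough illustration of "estimate ± margin of error", the sketch below computes a z confidence interval for a population mean, assuming σ is known. The function name and the numbers in the example are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, confidence):
    """Return (low, high) for a level-C interval, assuming known sigma."""
    # Critical value z*: the central area under the standard normal curve is C.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sigma / sqrt(n)   # margin of error m
    return sample_mean - margin, sample_mean + margin

# Hypothetical data: sample mean 240, sigma 20, n = 100, C = 95%.
low, high = z_confidence_interval(sample_mean=240.0, sigma=20.0, n=100, confidence=0.95)
print(f"95% CI: ({low:.2f}, {high:.2f})")
```

A higher confidence level gives a larger z* and therefore a wider interval for the same data.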

  5. (in %)
    confidence level C
  6. Standardizing the normal curve using z

    Ex. For a 98% confidence level, z* = 2.326 (the central 98% of the standard normal curve lies between −2.326 and 2.326).
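The critical value z* can be looked up in a table or computed. A minimal sketch using Python's standard library follows; the helper name critical_z is an assumption made for illustration.

```python
from statistics import NormalDist

def critical_z(confidence):
    """z* such that the area between -z* and +z* under N(0, 1) equals confidence."""
    upper_tail = (1 - confidence) / 2          # area left in each tail
    return NormalDist().inv_cdf(1 - upper_tail)

for level in (0.90, 0.95, 0.98, 0.99):
    print(f"C = {level:.0%}: z* = {critical_z(level):.3f}")
```

For C = 98% this reproduces the z* = 2.326 quoted in the example above.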
  7. Hypothesis tests
    • test of statistical significance
    • null hypothesis
    • alternative hypothesis
  8. test of statistical significance
    • tests a specific hypothesis using sample data to decide on the validity of the hypothesis.
  9. hypothesis
    is an assumption or theory about the characteristics of one or more variables in one or more populations.
  10. null hypothesis
    is the statement being tested. It is a statement of “no effect” or “no difference,” and it is labeled H0.
  11. alternative hypothesis
    is the claim we are trying to find evidence for, and it is labeled Ha.
  12. two-tail or two-sided test
    • of the population mean has these null and alternative hypotheses:
    • H0: μ = [a specific number] Ha: μ ≠ [a specific number]
  13. one-tail or one-sided test
    • of a population mean has these null and alternative hypotheses:
    • H0: μ = [a specific number] Ha: μ < [a specific number] OR
    • H0: μ = [a specific number] Ha: μ > [a specific number]
  14. Tests for a population mean
    To test the hypothesis H0: μ = μ0 based on an SRS of size n from a Normal population with unknown mean μ and known standard deviation σ, we rely on the properties of the sampling distribution of the sample mean, N(μ, σ/√n).
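Under H0 the sample mean follows N(μ0, σ/√n), so it can be standardized into a z test statistic. The sketch below assumes known σ; the function name and the example numbers are illustrative only.

```python
from math import sqrt

def z_statistic(sample_mean, mu0, sigma, n):
    """Standardize the sample mean under H0: mu = mu0, assuming known sigma."""
    standard_error = sigma / sqrt(n)   # standard deviation of the sample mean
    return (sample_mean - mu0) / standard_error

# Hypothetical data: H0: mu = 100, sample mean 103, sigma 15, n = 25.
z = z_statistic(sample_mean=103.0, mu0=100.0, sigma=15.0, n=25)
print(f"z = {z:.2f}")
```

Here the sample mean lies one standard error above μ0, so z = 1.00.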
  15. p-value
    is the area under the sampling distribution for values at least as extreme, in the direction of Ha, as that of our random sample.

  16. The P-value
    • Tests of statistical significance quantify the chance of obtaining a particular random sample result if the null hypothesis were true. This quantity is the P-value.
    • This is a way of assessing the “believability” of the null hypothesis given the evidence provided by a random sample.
  17. P-value in one-sided and two-sided tests
    To calculate the P-value for a two-sided test, use the symmetry of the normal curve. Find the P-value for a one-sided test and double it.
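The relationship between one-sided and two-sided P-values can be sketched as follows, starting from a z test statistic. The function name and the labels "less", "greater", and "two-sided" are assumptions chosen for this example.

```python
from statistics import NormalDist

def p_value(z, alternative):
    """P-value for a z test statistic under the given alternative hypothesis."""
    std_normal = NormalDist()
    if alternative == "less":        # Ha: mu < mu0, area in the lower tail
        return std_normal.cdf(z)
    if alternative == "greater":     # Ha: mu > mu0, area in the upper tail
        return 1 - std_normal.cdf(z)
    if alternative == "two-sided":   # Ha: mu != mu0, double the one-tail area
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError(f"unknown alternative: {alternative!r}")

one_sided = p_value(2.0, "greater")
two_sided = p_value(2.0, "two-sided")
print(f"one-sided: {one_sided:.4f}, two-sided: {two_sided:.4f}")
```

For the same z, the two-sided P-value is twice the one-sided P-value taken in the direction of the observed difference.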

  18. Interpreting a P-value
    Could random variation alone account for the difference between the null hypothesis and observations from a random sample?

    • A small P-value implies that random variation because of the sampling process alone is not likely to account for the observed difference.
    • With a small P-value, we reject H0. The true property of the population is significantly different from what was stated in H0.

    Thus small P-values are strong evidence AGAINST H0.

    Oftentimes, a P-value of 0.05 or less is considered significant: the phenomenon observed is unlikely to be entirely due to chance variation from the random sampling.
  19. The significance level α
    • The significance level, α, is the largest P-value tolerated for rejecting a true null hypothesis (how much evidence against H0 we require). This value is decided arbitrarily before conducting the test.

    If the P-value is equal to or less than α (p ≤ α), then we reject H0. If the P-value is greater than α (p > α), then we fail to reject H0.

    When the z score falls within the rejection region (shaded area on the tail-side), the P-value is smaller than α and you have shown statistical significance.

    • Rejection region for a two-tail test of μ with α = 0.05 (5%)
    • A two-sided test means that α is spread between both tails of the curve, thus:
    • a middle area C of 1 − α = 95%, and
    • an area of α/2 = 0.025 in each tail (lower and upper)
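The two-tail rejection region can be expressed through a critical value: reject H0 when |z| is at least the z value that leaves α/2 in the upper tail. The sketch below is one way to write that check; the function names are hypothetical.

```python
from statistics import NormalDist

def two_sided_critical_value(alpha):
    """z value that leaves an area of alpha/2 in the upper tail of N(0, 1)."""
    return NormalDist().inv_cdf(1 - alpha / 2)

def in_rejection_region(z, alpha):
    """True when z falls in either tail of the two-sided rejection region."""
    return abs(z) >= two_sided_critical_value(alpha)

z_crit = two_sided_critical_value(0.05)
print(f"reject H0 when |z| >= {z_crit:.2f}")
print(in_rejection_region(2.1, 0.05), in_rejection_region(1.5, 0.05))
```

With α = 0.05 the critical value is about 1.96, so z = 2.1 falls in the rejection region and z = 1.5 does not.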

  20. Confidence intervals to test hypotheses
    • Because a two-sided test is symmetrical, you can also use a confidence interval to test a two-sided hypothesis.
    • In a two-sided test, C = 1 – α.

    • C: confidence level
    • α: significance level
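The link C = 1 − α can be illustrated by checking that a two-sided z test at level α and a level-C confidence interval reach the same decision. This is a sketch assuming known σ; the function names and data are made up for the example.

```python
from math import sqrt
from statistics import NormalDist

def rejects_by_p_value(sample_mean, mu0, sigma, n, alpha):
    """Two-sided z test: reject H0 when the P-value is at most alpha."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return p <= alpha

def rejects_by_interval(sample_mean, mu0, sigma, n, alpha):
    """Reject H0 when mu0 lies outside the level C = 1 - alpha interval."""
    z_star = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z_star * sigma / sqrt(n)
    return not (sample_mean - margin <= mu0 <= sample_mean + margin)

# Hypothetical data: H0: mu = 100, sigma 15, n = 25, alpha = 0.05.
for xbar in (103.0, 107.0):
    by_p = rejects_by_p_value(xbar, 100.0, 15.0, 25, 0.05)
    by_ci = rejects_by_interval(xbar, 100.0, 15.0, 25, 0.05)
    print(f"sample mean {xbar}: reject by P-value = {by_p}, reject by CI = {by_ci}")
```

Both approaches agree for each sample mean: the interval excludes μ0 exactly when the two-sided P-value is below α.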
  21. Logic of confidence interval test
    • A confidence interval gives a black and white answer: Reject or don’t reject H0. But it also estimates a range of likely values for the true population mean μ.
    • A P-value quantifies how strong the evidence is against H0. But if you reject H0, it doesn’t provide any information about the true population mean μ.