The flashcards below were created by user
firefly501
on FreezingBlue Flashcards.

Introduction to Inference
 Estimates of the sample mean from random samples tend to fall relatively close to the population parameter μ.
 About 95% of all sample means will be within roughly 2 standard deviations (2*σ/√n) of the population parameter μ.
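
The "roughly 2 standard deviations" figure can be checked with a short sketch, assuming the sampling distribution of the sample mean is Normal. The variable name is illustrative only.

```python
from statistics import NormalDist

# Area under a standard Normal curve within 2 standard deviations of its center.
std_normal = NormalDist()  # mean 0, standard deviation 1
within_two_sd = std_normal.cdf(2) - std_normal.cdf(-2)

print(round(within_two_sd, 4))  # 0.9545, i.e. roughly 95%
```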

A level C confidence interval for a parameter has two parts:
 (1) An interval calculated from the data, usually of the form
 estimate ± margin of error
(2) A confidence level C, which gives the probability that the interval will capture the true parameter value in repeated samples, or the success rate for the method.

confidence interval
for a population mean can be expressed from a sample mean as: x̄ ± z*·σ/√n
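
A minimal sketch of this interval, assuming the usual z procedure in which the population standard deviation σ is known. The function name and the sample numbers are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(x_bar, sigma, n, confidence):
    """Return (low, high) for the population mean, assuming sigma is known."""
    # z* leaves an area of (1 - C) / 2 in each tail of the standard Normal curve.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sigma / sqrt(n)  # margin of error
    return x_bar - margin, x_bar + margin

# Hypothetical sample: mean 50, known sigma 10, n = 25, 95% confidence.
low, high = z_confidence_interval(x_bar=50.0, sigma=10.0, n=25, confidence=0.95)
print(round(low, 2), round(high, 2))  # 46.08 53.92
```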

confidence level C
(in %) indicates the success rate of the method that produces the interval. It represents the area under the normal curve within ± m of the center of the curve.

Standardizing the normal curve using z
Ex. For a 98% confidence level, z*=2.326
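
The example value can be reproduced with a short sketch. It assumes z* is the standard Normal quantile that leaves an area of (1 − C)/2 in the upper tail; the function name is hypothetical.

```python
from statistics import NormalDist

def critical_z(confidence):
    """Return z* for a confidence level C given as a fraction (e.g. 0.98)."""
    # The upper tail holds (1 - C) / 2, so z* is the quantile at 1 - (1 - C) / 2.
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

print(round(critical_z(0.98), 3))  # 2.326
```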

Hypotheses tests
 1. test of statistical significance
 2. null hypothesis
 3. alternative hypothesis

test of statistical significance
 tests a specific hypothesis using sample data to decide on the validity of the hypothesis.

hypothesis
is an assumption, or a theory about the characteristics of one or more variables in one or more populations

null hypothesis
is the statement being tested. It is a statement of “no effect” or “no difference,” and it is labeled H0.

alternative hypothesis
is the claim we are trying to find evidence for, and it is labeled Ha.

two-tail or two-sided test
 of the population mean has these null and alternative hypotheses:
 H0: μ = [a specific number] Ha: μ ≠ [a specific number]

one-tail or one-sided test
 of a population mean has these null and alternative hypotheses:
 H0: μ = [a specific number] Ha: μ < [a specific number] OR
 H0: μ = [a specific number] Ha: μ > [a specific number]

Tests for a population mean
To test the hypothesis H0: μ = μ0 based on an SRS of size n from a Normal population with unknown mean μ and known standard deviation σ, we rely on the properties of the sampling distribution N(μ, σ/√n).
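
In practice the sample mean is standardized against this sampling distribution. A minimal sketch, assuming the standard z statistic z = (x̄ − μ0) / (σ/√n); the function name and numbers are hypothetical.

```python
from math import sqrt

def z_statistic(x_bar, mu_0, sigma, n):
    """Standardize the sample mean under H0: mu = mu_0, with sigma known."""
    standard_error = sigma / sqrt(n)  # standard deviation of the sample mean
    return (x_bar - mu_0) / standard_error

# Hypothetical sample: mean 52 from n = 25, testing mu_0 = 50 with sigma = 10.
z = z_statistic(x_bar=52.0, mu_0=50.0, sigma=10.0, n=25)
print(z)  # 1.0
```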

P-value
is the area under the sampling distribution for values at least as extreme, in the direction of Ha, as that of our random sample.

The P-value
 Tests of statistical significance quantify the chance of obtaining a random sample result at least as extreme as the one observed if the null hypothesis were true. This quantity is the P-value.
 This is a way of assessing the “believability” of the null hypothesis given the evidence provided by a random sample.

P-value in one-sided and two-sided tests
To calculate the P-value for a two-sided test, use the symmetry of the normal curve. Find the P-value for a one-sided test and double it.
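
The doubling rule can be sketched as follows, assuming the test statistic z has already been standardized. The function name and the labels "less", "greater", and "two-sided" are illustrative choices, not part of the original cards.

```python
from statistics import NormalDist

def p_value(z, alternative):
    """P-value for a z statistic under the standard Normal curve."""
    std_normal = NormalDist()
    if alternative == "less":       # Ha: mu < mu_0, area in the lower tail
        return std_normal.cdf(z)
    if alternative == "greater":    # Ha: mu > mu_0, area in the upper tail
        return 1 - std_normal.cdf(z)
    if alternative == "two-sided":  # Ha: mu != mu_0, double the one-tail area
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")

p_one = p_value(1.96, "greater")
p_two = p_value(1.96, "two-sided")
print(round(p_one, 3), round(p_two, 3))  # 0.025 0.05
```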

Interpreting a P-value
Could random variation alone account for the difference between the null hypothesis and observations from a random sample?
 A small P-value implies that random variation because of the sampling process alone is not likely to account for the observed difference.
 With a small P-value, we reject H0. The true property of the population is significantly different from what was stated in H0.
Thus small P-values are strong evidence AGAINST H0.
Oftentimes, a P-value of 0.05 or less is considered significant: the phenomenon observed is unlikely to be entirely due to chance variation from the random sampling.

The significance level α
 The significance level, α, is the largest P-value tolerated for rejecting a true null hypothesis (how much evidence against H0 we require). This value is decided arbitrarily before conducting the test.
If the P-value is equal to or less than α (p ≤ α), then we reject H0. If the P-value is greater than α (p > α), then we fail to reject H0.
When the z score falls within the rejection region (shaded area in the tail), the P-value is smaller than α and you have shown statistical significance.
 Rejection region for a two-tail test of μ with α = 0.05 (5%)
 A two-sided test means that α is spread between both tails of the curve, thus:
 a middle area C of 1 − α = 95%, and
 an upper tail area of α/2 = 0.025
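
The two-tail rejection region for α = 0.05 can be sketched as below, assuming a standardized z statistic. The function name and the test values of z are hypothetical.

```python
from statistics import NormalDist

alpha = 0.05
# Each tail holds alpha / 2, so the cutoff is the quantile at 1 - alpha / 2.
z_cutoff = NormalDist().inv_cdf(1 - alpha / 2)

def in_rejection_region(z, cutoff):
    """True when z lies in either tail beyond the cutoff (two-sided test)."""
    return abs(z) >= cutoff

print(round(z_cutoff, 2))                  # 1.96
print(in_rejection_region(2.1, z_cutoff))  # True: reject H0
print(in_rejection_region(1.5, z_cutoff))  # False: fail to reject H0
```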

Confidence intervals to test hypotheses
 Because a two-sided test is symmetrical, you can also use a confidence interval to test a two-sided hypothesis.
 In a two-sided test, C = 1 − α.
 C: confidence level
 α: significance level
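
A small sketch of this equivalence, assuming a z procedure with known σ: H0: μ = μ0 is rejected at level α when μ0 falls outside the level C = 1 − α interval. The function names and sample numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def reject_by_interval(x_bar, mu_0, sigma, n, alpha):
    """Reject H0 when mu_0 lies outside the C = 1 - alpha confidence interval."""
    z_star = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z_star * sigma / sqrt(n)
    return not (x_bar - margin <= mu_0 <= x_bar + margin)

def reject_by_p_value(x_bar, mu_0, sigma, n, alpha):
    """Reject H0 when the two-sided P-value is at most alpha."""
    z = (x_bar - mu_0) / (sigma / sqrt(n))
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return p <= alpha

# Two hypothetical samples tested against mu_0 = 50 with sigma = 10, n = 25.
for x_bar in (52.0, 55.0):
    by_ci = reject_by_interval(x_bar, 50.0, 10.0, 25, alpha=0.05)
    by_p = reject_by_p_value(x_bar, 50.0, 10.0, 25, alpha=0.05)
    print(x_bar, by_ci, by_p)  # both methods give the same decision
```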

Logic of confidence interval test
 A confidence interval gives a black-and-white answer: Reject or don’t reject H0. But it also estimates a range of likely values for the true population mean μ.
 A P-value quantifies how strong the evidence is against the H0. But if you reject H0, it doesn’t provide any information about the true population mean μ.

