The flashcards below were created by user
on FreezingBlue Flashcards.
The study of the collection, organization, analysis, interpretation, and presentation of data.
A numerical summary calculated from measured data.
Statement of the research question
Carefully designed study (observation, experiment)
- compute the consequences of your hypothesis
- What are the chances of seeing my result when the null hypothesis is true?
Interpret the analysis in terms of the statistic.
A characteristic or quality that can vary from individual to individual (or from event to event)
Categorical (qualitative) variable
Classified by the type or group
Values whose original observation is a category and whose order does not matter (eye color)
Categorical (qualitative) variable
Classified by the type or group
Values have a defined order or hierarchy that matters (year in school, grades)
A variable originally measured as a number and grouped later to make categorical data
integer, whole number, no fraction
variable measured as a number
any possible numerical value like ratio, interval
Complete set of items that share at least one property in common that is the subject of a statistical analysis.
A summary measure of a population is a
A summary measure of a sample is a
attempt to understand cause-and-effect relationships. However, unlike experiments, the researcher is not able to control
An experiment is any process or study which results in the collection of data, the outcome of which is unknown. In statistics, the term is usually restricted to situations in which the researcher has control over some of the conditions under which the experiment takes place.
Explanatory (Independent) variable
In an experiment, the independent variable is the variable that is varied or manipulated by the researcher
Response (dependent) variable
The dependent variable is the response that is measured. An independent variable is the presumed cause, whereas the dependent variable is the presumed effect.
Two or more groups are compared to determine the difference in the value of the variable of interest
Characterizes or identifies certain attributes (qualitative/categorical). Describes the values of a variable.
An extraneous variable that correlates with both the dependent variable and the independent variable
Apparently true but actually false; misleading. Apparent but not valid.
Randomness ensures independence. The inclusion of one individual or event in a study does not affect the chance of another individual being included in the study.
Each possible sample has the same chance of being selected as any other sample.
levels of the factor
No treatment, placebo
Randomisation is the process by which experimental units (the basic objects upon which the study or experiment is carried out) are allocated to treatments by a random process
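As a rough illustration of random allocation, the sketch below shuffles the experimental units and deals them out to the treatments in turn. The function name `randomise`, the treatment labels, and the fixed seed are assumptions made for this example only.

```python
import random

def randomise(units, treatments, seed=None):
    """Randomly allocate experimental units to treatments in near-equal groups."""
    rng = random.Random(seed)          # seed is only here to make the example repeatable
    shuffled = list(units)
    rng.shuffle(shuffled)              # the random process: every ordering is equally likely
    allocation = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        # deal the shuffled units out to the treatments in turn
        allocation[treatments[i % len(treatments)]].append(unit)
    return allocation

# Hypothetical study: 8 subjects, two groups
groups = randomise(range(1, 9), ["control", "treatment"], seed=1)
print(groups)
```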
Within the experiment, different sample same procedure
External, repeating the experiment with new subjects and similar results
Categorical Variables - Frequency
The number of times the event occurred in an experiment or study.
Categorical Variables - frequency distributions
A table that displays the frequency of various outcomes in a sample. Each entry in the table contains the frequency or count of the occurrences of values within a particular group or interval, and in this way, the table summarizes the distribution of values in the sample.
Relative frequency is another term for proportion; it is the value calculated by dividing the number of times an event occurs by the total number of times an experiment is carried out. The probability of an event can be thought of as its long-run relative frequency when the experiment is carried out many times. If an experiment is repeated n times, and event E occurs r times, then the relative frequency of the event E is defined to be rf_n(E) = r/n
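A minimal sketch of a frequency distribution with relative frequencies r/n, using made-up eye-color observations. The helper name `frequency_table` is hypothetical.

```python
from collections import Counter

def frequency_table(observations):
    """Map each category to (frequency, relative frequency)."""
    counts = Counter(observations)     # frequency: number of times each value occurs
    n = len(observations)              # total number of observations
    return {value: (count, count / n) for value, count in counts.items()}

# Made-up categorical data for illustration
eye_colors = ["brown", "blue", "brown", "green", "brown", "blue"]
table = frequency_table(eye_colors)
for value, (freq, rel_freq) in table.items():
    print(value, freq, round(rel_freq, 3))
```

The relative frequencies always sum to 1, which is a quick check that the table is complete.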
Continuous quantitative variables - histogram
Created by constructing class intervals, or bins, which are equally sized ranges of values associated with the variable.
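The binning step can be sketched as follows. This assumes equal-width bins spanning the minimum to the maximum, with the maximum placed in the last bin; the function name and the data are illustrative only.

```python
def histogram_counts(values, num_bins):
    """Count how many values fall into each of num_bins equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins       # assumes the values are not all identical
    counts = [0] * num_bins
    for v in values:
        # the maximum value would index one past the end, so clamp it into the last bin
        index = min(int((v - lo) / width), num_bins - 1)
        counts[index] += 1
    return counts

values = [1, 2, 2, 3, 5, 8, 9, 10]
counts = histogram_counts(values, 3)   # bins: [1, 4), [4, 7), [7, 10]
print(counts)
```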
Central tendency (most likely to occur) (mean, median, mode)
How well a value is repeated (dispersion)
same in each bin
Most frequently occurring value
- Lowest value has the highest frequency.
- Positive trend.
- negative trend
- highest value has the highest frequency
two modes, two separate high frequencies
Bimodal skewed distribution
One monster mode and a mini mode.
the most common numerical measure of central tendency, which is a measure of center of the distribution (mean average).
The midpoint of a distribution, the number such that half of the observations are smaller and the other half are larger.
Relationship between shape and central tendency
- right skew: mean > median (ȳ > M)
- left skew: mean < median (ȳ < M)
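A small check of this relationship on two made-up data sets, one with a long right tail and one with a long left tail. It uses the standard library `statistics` module.

```python
from statistics import mean, median

# Made-up data: one large value pulls the mean to the right
right_skewed = [1, 2, 2, 3, 3, 4, 20]
# Made-up data: one small value pulls the mean to the left
left_skewed = [1, 17, 18, 18, 19, 19, 20]

print(mean(right_skewed), median(right_skewed))   # mean is above the median
print(mean(left_skewed), median(left_skewed))     # mean is below the median
```

The median barely moves when an extreme value is added, while the mean is pulled toward the tail.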
Dispersion (variability) - range
Dispersion (variability) - quartile
Three quartiles divide the observations into 4 equal parts
First quartile Q1: the median of the observations below the median.
Third quartile Q3: the median of the observations above the median.
Dispersion - Inter-quartile Range
Q3 - Q1
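A sketch of the quartiles as "median of the lower half" and "median of the upper half", followed by the inter-quartile range. It assumes the convention that the overall median is left out of both halves when the number of observations is odd; other textbooks and software use slightly different rules.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves method."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]                                  # observations below the median
    upper = data[half + 1:] if n % 2 else data[half:]    # observations above the median
    return median(lower), median(data), median(upper)

data = [1, 3, 5, 7, 9, 11, 13]
q1, q2, q3 = quartiles(data)
iqr = q3 - q1                                            # inter-quartile range
print(q1, q2, q3, iqr)
```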
Dispersion - variance (s²)
the average of the squares of deviations of the observations from their mean.
standard deviation (s)
- Square root of the variance
- Measures the spread of the observations about the mean
The observations whose values do not follow the trend or bulk of the rest of the data.
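A minimal sketch of the variance and standard deviation. It assumes the sample versions, where the sum of squared deviations is divided by n - 1 rather than n; the data are made up.

```python
import math

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1 (assumed convention)."""
    n = len(values)
    ybar = sum(values) / n
    return sum((y - ybar) ** 2 for y in values) / (n - 1)

values = [2, 4, 4, 4, 5, 5, 7, 9]
s2 = sample_variance(values)   # variance
s = math.sqrt(s2)              # standard deviation is the square root of the variance
print(s2, s)
```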
- Values less than Q1 - 1.5 (IQR)
- Values more than Q3 + 1.5 (IQR)
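The two rules above can be sketched as a pair of "fences". Q1 and Q3 are passed in directly here, and the data set is made up for illustration.

```python
def find_outliers(values, q1, q3):
    """Return the values outside Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]

data = [1, 3, 5, 7, 9, 11, 40]
# Q1 = 3 and Q3 = 11 for this data under the median-of-halves convention
outliers = find_outliers(data, q1=3, q3=11)
print(outliers)
```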
How often a specific event occurs in repeated samples or repeated processes.
P(E)=(number of ways even
The Binomial Distribution
- 1. The experiment consists of n identical trials.
- 2. Each trial results in one of two outcomes.
- 3. The probability of success on a single trial is equal to π, and π remains the same from trial to trial.
- 4. The trials are independent; that is, the outcome of one trial does not influence the outcomes of any other trial.
- 5. The random variable y is the number of successes observed during the n trials.
$p(y) = \frac{n!}{y!\,(n-y)!}\,\pi^{y}(1-\pi)^{n-y}$
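A short sketch of the binomial probability of y successes in n trials, assuming the standard formula n!/(y!(n-y)!) · π^y · (1-π)^(n-y). The function name `binomial_pmf` is chosen for this example.

```python
from math import comb

def binomial_pmf(y, n, pi):
    """Probability of exactly y successes in n independent trials with success probability pi."""
    return comb(n, y) * pi**y * (1 - pi)**(n - y)

# Example: exactly 2 heads in 4 tosses of a fair coin
p = binomial_pmf(2, 4, 0.5)
print(p)

# The probabilities over y = 0..n add up to 1
total = sum(binomial_pmf(y, 4, 0.5) for y in range(5))
print(total)
```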
The chi-square goodness-of-fit
It tests whether the variability observed in the frequencies is simply due to random chance, or whether it is improbable that the observed frequencies come from the specified distribution.
- 1. Measurement is on at least a nominal scale.
- 2. Categories or groups are mutually exclusive.
- 3. Observations are independent.
- 4. No category has a sample size restriction.
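A sketch of the goodness-of-fit statistic, assuming the usual form: the sum over categories of (observed - expected)² / expected, with (number of categories - 1) degrees of freedom. The counts and hypothesized proportions are made up.

```python
def chi_square_gof(observed, expected_proportions):
    """Return the chi-square statistic and its degrees of freedom."""
    n = sum(observed)
    expected = [n * p for p in expected_proportions]   # expected counts under the null
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return statistic, df

# Made-up counts for three categories, tested against a 2:1:1 ratio
observed = [50, 30, 20]
statistic, df = chi_square_gof(observed, [0.5, 0.25, 0.25])
print(statistic, df)
```

The statistic is then compared with a chi-square distribution with the given degrees of freedom to obtain a p-value.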
The probability that, if the null were true, sampling variation would produce an estimate that is at least as far from the hypothesized value as our data estimate.
How likely it is to get that result if the null were true.
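To make the idea concrete, here is a sketch of a one-sided p-value for a binomial count: the probability, if the null value of π were true, of seeing a count at least as large as the one observed. The one-sided setup, the function name, and the numbers are assumptions for illustration.

```python
from math import comb

def upper_tail_p_value(y_observed, n, pi_null):
    """P(Y >= y_observed) when Y is binomial(n, pi_null)."""
    return sum(
        comb(n, y) * pi_null**y * (1 - pi_null)**(n - y)
        for y in range(y_observed, n + 1)
    )

# Hypothetical result: 9 successes in 10 trials, null hypothesis pi = 0.5
p_value = upper_tail_p_value(9, 10, 0.5)
print(p_value)
```

A small p-value means a result this extreme would rarely occur by sampling variation alone if the null were true.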
Chi-square test for independence
- a. Data are frequencies based on random sample.
- b. Samples are independent.
- c. value restriction
- 2. null: There is no relationship, the variables are independent.
- alt.: There is a relationship, the two variables are dependent.
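A sketch of the test statistic for independence in a two-way table of counts. It assumes the usual expected count (row total × column total / grand total) and (rows - 1) × (columns - 1) degrees of freedom; the table is made up.

```python
def chi_square_independence(table):
    """Return the chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # count expected in this cell if the two variables were independent
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

# Made-up 2x2 table of frequencies
table = [[10, 20],
         [20, 10]]
statistic, df = chi_square_independence(table)
print(statistic, df)
```

A large statistic relative to the chi-square distribution with these degrees of freedom is evidence against the null hypothesis of independence.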