The flashcards below were created by user
doncheto
on FreezingBlue Flashcards.

statistics
The study of the collection, organization, analysis, interpretation, and presentation of data.

statistic
A numerical summary calculated from measured data.

statements of research question
hypothesis

Data collection
A carefully designed study (observation, experiments)

Data analysis
 Compute the consequences of your hypothesis.
 What are the chances of seeing my result when the null hypothesis is true?

statistical inference
Interpret the analysis in terms of the statistic.

Variable
A characteristic or quality that can change from individual to individual (or from event to event)

Categorical (qualitative) variable
Attributes
Classified by the type or group
Values whose original observation is a category and order does not matter (eye color)

Categorical (qualitative) variable
Rank (ordinal)
Classified by the type or group
values have defined order, hierarchy that matters (year in school, grades)

Quantitative variable
Discrete variable
variable is measured originally as a number; may be grouped later to make categorical data
integer, whole number, no fraction

Quantitative variable
continuous variable
variable measured as a number
can take any numerical value within a range (ratio or interval scale)

Population
Complete set of items that share at least one property in common that is the subject of a statistical analysis.

A summary measure of a population is a
parameter

A summary measure of a sample is a
statistic

Observational study
An attempt to understand cause-and-effect relationships. However, unlike experiments, the researcher is not able to control

Experimental study
An experiment is any process or study which results in the collection of data, the outcome of which is unknown. In statistics, the term is usually restricted to situations in which the researcher has control over some of the conditions under which the experiment takes place.

Explanatory (Independent) variable
In an experiment, the independent variable is the variable that is varied or manipulated by the researcher.

Response (dependent) variable
The dependent variable is the response that is measured. An independent variable is the presumed cause, whereas the dependent variable is the presumed effect.

Comparative study
Two or more groups are compared to determine differences in the value of the variable of interest.

Descriptive study
Characterizes or identifies certain attributes (qualitative/categorical). Describes the values of a variable.

Confounding Variables
An extraneous variable that correlates with both the dependent variable and the independent variable.

Spurious associations
Apparently true but actually false; misleading. Apparent but not valid.

Independence
Randomness ensures independence. The inclusion of one individual or event in a study does not affect the chance of another individual being included in the study.

Randomness
Each possible sample has the same chance of being selected as any other sample.

Factors
explanatory variables

Treatments
levels of the factor

Control Treatment
No treatment, placebo

Randomisation
Randomisation is the process by which experimental units (the basic objects upon which the study or experiment is carried out) are allocated to treatments by a random process.
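
A minimal sketch of random allocation, assuming equal-sized groups are wanted. The function name `randomise` and the unit labels are hypothetical, chosen only for illustration.

```python
import random

def randomise(units, treatments, seed=None):
    """Randomly allocate experimental units to treatments in near-equal groups."""
    rng = random.Random(seed)          # seed is optional, for reproducibility
    shuffled = list(units)
    rng.shuffle(shuffled)              # random order decides the allocation
    allocation = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        # Deal the shuffled units out to the treatments in turn.
        allocation[treatments[i % len(treatments)]].append(unit)
    return allocation

# Hypothetical example: 12 units, one control and one treatment group.
groups = randomise(range(1, 13), ["control", "treatment"], seed=1)
```

Every unit lands in exactly one group, and the researcher does not choose which.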

Replication
Within the experiment: different samples, same procedure.

Repeatability
External: repeating the experiment with new subjects and obtaining similar results.

Categorical Variables - Frequency
The number of times the event occurred in an experiment or study.

Categorical Variables - frequency distributions
A table that displays the frequency of various outcomes in a sample. Each entry in the table contains the frequency or count of the occurrences of values within a particular group or interval, and in this way, the table summarizes the distribution of values in the sample.

Relative frequency/Proportion
Relative frequency is another term for proportion; it is the value calculated by dividing the number of times an event occurs by the total number of times an experiment is carried out. The probability of an event can be thought of as its long-run relative frequency when the experiment is carried out many times. If an experiment is repeated n times, and event E occurs r times, then the relative frequency of the event E is defined to be rf_{n}(E) = r/n
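
A short sketch of the r/n calculation applied to every category at once. The eye-color data and the name `relative_frequencies` are made up for illustration.

```python
from collections import Counter

def relative_frequencies(observations):
    """Return {value: count / n} for each distinct value observed."""
    counts = Counter(observations)     # frequency of each value (r)
    n = len(observations)              # total number of observations (n)
    return {value: count / n for value, count in counts.items()}

# Hypothetical sample of 8 observations.
eye_colors = ["brown", "blue", "brown", "green", "brown", "blue", "brown", "blue"]
rf = relative_frequencies(eye_colors)
# brown: 4/8 = 0.5, blue: 3/8 = 0.375, green: 1/8 = 0.125
```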

Continuous quantitative variables - histogram
Created by constructing class intervals, or bins, which are equally sized ranges of values associated with the variable.
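
A rough sketch of equal-width binning, assuming the bins span from the smallest to the largest observation and the values are not all identical. The helper name `histogram_counts` and the heights are hypothetical.

```python
def histogram_counts(values, n_bins):
    """Count observations in n_bins equally sized class intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins         # assumes hi > lo
    counts = [0] * n_bins
    for v in values:
        # The maximum value is placed in the last bin rather than a new one.
        index = min(int((v - lo) / width), n_bins - 1)
        counts[index] += 1
    edges = [lo + i * width for i in range(n_bins + 1)]
    return edges, counts

# Hypothetical heights in cm, split into 3 bins of width 10.
heights = [150, 152, 155, 160, 161, 163, 165, 170, 172, 180]
edges, counts = histogram_counts(heights, 3)
# counts is [3, 4, 3] for the bins 150-160, 160-170, 170-180
```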

Accuracy
Central tendency (most likely to occur) (mean, median, mode)

Precision
How well a value is repeated (dispersion)

Uniform distribution
same in each bin

symmetric/unimodal distribution
A single most frequently occurring value (one mode)

Right-skewed distribution
 Lower values have the highest frequencies; the long tail extends to the right.
 Positive skew.

Left-skewed distribution
 Negative skew.
 Higher values have the highest frequencies; the long tail extends to the left.

Bimodal
two modes, two separate high frequencies

Bimodal skewed distribution
One monster mode and a mini mode.

central tendency
The mean (average) is the most common numerical measure of central tendency, which is a measure of the center of the distribution.

Median (M)
The midpoint of a distribution: the number such that half of the observations are smaller and the other half are larger.

Relationship between shape and central tendency
 right skew: mean (ȳ) > M
 left skew: mean (ȳ) < M
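
A quick check of this relationship on two small made-up samples, using Python's standard `statistics` module. The data are hypothetical, and the rule is a tendency rather than a guarantee for every data set.

```python
import statistics

# One large value creates a long right tail.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]
right_mean = statistics.mean(right_skewed)       # 38 / 8 = 4.75
right_median = statistics.median(right_skewed)   # (3 + 3) / 2 = 3.0

# One small value creates a long left tail.
left_skewed = [1, 17, 18, 18, 18, 19, 19, 20]
left_mean = statistics.mean(left_skewed)         # 130 / 8 = 16.25
left_median = statistics.median(left_skewed)     # (18 + 18) / 2 = 18.0
```

The extreme value drags the mean toward the tail, while the median barely moves.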

Dispersion (variability) - range
Max - min

Dispersion (variability) - quartile
Three quartiles divide the observations into 4 equal parts.
first quartile Q_{1}: the median of the observations below the median.
third quartile Q_{3}: the median of the observations above the median.

Dispersion - Interquartile Range
Q_{3} - Q_{1}
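
A sketch of the quartiles and the interquartile range using the median-of-halves rule (Q_{1} is the median of the observations below the median, Q_{3} the median of those above). The function name `quartiles` is hypothetical, and other software may use slightly different quartile conventions.

```python
import statistics

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves rule."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]      # observations below the median
    upper = data[-half:]     # observations above the median
    return (statistics.median(lower),
            statistics.median(data),
            statistics.median(upper))

# Hypothetical sample of 9 observations.
q1, m, q3 = quartiles([2, 4, 4, 5, 7, 8, 9, 10, 12])
iqr = q3 - q1
# q1 = 4.0, m = 7, q3 = 9.5, iqr = 5.5
```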

Dispersion - variance (s^{2})
the average of the squares of deviations of the observations from their mean.

standard deviation (s)
 Square root of the variance
 Measures the spread of the observations about the mean
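
A small sketch of the sample variance and standard deviation. It assumes the usual sample convention of dividing the sum of squared deviations by n - 1 rather than n; the function name and data are hypothetical.

```python
import math

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((y - mean) ** 2 for y in values) / (n - 1)

# Hypothetical sample: mean is 5, squared deviations sum to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]
s2 = sample_variance(data)   # 32 / 7
s = math.sqrt(s2)            # standard deviation is the square root
```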

Outliers
The observations whose values do not follow the trend or bulk of the rest of the data.
 Values less than Q_{1} - 1.5 (IQR)
 Values more than Q_{3} + 1.5 (IQR)
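
The two cutoffs can be sketched as follows, taking the quartiles as given inputs. The function names and the sample values are hypothetical.

```python
def iqr_fences(q1, q3):
    """Return the lower and upper outlier cutoffs from the quartiles."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def find_outliers(values, q1, q3):
    """Return the values that fall outside the 1.5 * IQR fences."""
    low, high = iqr_fences(q1, q3)
    return [v for v in values if v < low or v > high]

# Hypothetical data with quartiles Q1 = 4 and Q3 = 9.5 (IQR = 5.5).
low, high = iqr_fences(4, 9.5)   # -4.25 and 17.75
outliers = find_outliers([2, 4, 4, 5, 7, 8, 9, 10, 12, 30], 4, 9.5)
# outliers is [30]
```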

Probability
How often a specific event occurs in repeated samples or repeated processes.
P(E)=(number of ways even

The Binomial Distribution
 1. The experiment consists of n identical trials.
 2. Each trial results in one of two outcomes.
 3. The probability of success on a single trial is equal to pi, and pi remains the same from trial to trial.
 4. The trials are independent; that is, the outcome of one trial does not influence the outcomes of any other trial.
 5. The random variable y is the number of successes observed during the n trials.
p(y) = (n!/(y!(n-y)!)) * pi^{y} * (1-pi)^{(n-y)}
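
A direct sketch of this formula, using `math.comb` for the n!/(y!(n-y)!) term. The function name `binomial_pmf` is hypothetical.

```python
from math import comb

def binomial_pmf(y, n, pi):
    """Probability of exactly y successes in n independent trials."""
    return comb(n, y) * pi**y * (1 - pi)**(n - y)

# Example: exactly 2 successes in 4 trials with pi = 0.5.
p = binomial_pmf(2, 4, 0.5)   # 6 * 0.5**4 = 0.375
```

Summing `binomial_pmf(y, n, pi)` over y = 0, ..., n gives 1, as any probability distribution must.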

The chi-square goodness-of-fit
It tests whether the variability observed in the frequencies is simply due to random chance, or whether it is improbable that the observed frequencies come from the specified distribution.
 1. Measurement is on at least a nominal scale.
 2. Categories or groups are mutually exclusive.
 3. Observations are independent.
 4. No category has a sample size restriction.
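
A minimal sketch of the test statistic, assuming the usual form: the sum over categories of (observed - expected)^2 / expected, with expected counts taken from the specified proportions. The function name and the counts are hypothetical, and converting the statistic to a p-value (which needs the chi-square distribution) is left out.

```python
def chi_square_statistic(observed, expected_proportions):
    """Sum of (O - E)^2 / E across categories."""
    n = sum(observed)
    expected = [n * p for p in expected_proportions]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical: 100 observations, 4 categories specified as equally likely.
x2 = chi_square_statistic([30, 20, 25, 25], [0.25, 0.25, 0.25, 0.25])
# expected count is 25 per category, so x2 = (25 + 25 + 0 + 0) / 25 = 2.0
```

Larger values indicate observed frequencies further from what the specified distribution predicts.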

p-value
The probability that, if the null were true, sampling variation would produce an estimate that is further away from the hypothesized value than our data estimate.
How likely it is to get that result if the null were true.
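
An illustrative sketch only: a one-sided p-value for a binomial count, computed as the probability of a result at least as extreme as the observed one when the null value of pi is true. The scenario (8 successes in 10 trials, null pi = 0.5) and the function name are assumptions.

```python
from math import comb

def binomial_upper_tail(y_observed, n, pi_null):
    """P(Y >= y_observed) when Y is binomial with n trials and pi = pi_null."""
    return sum(comb(n, y) * pi_null**y * (1 - pi_null)**(n - y)
               for y in range(y_observed, n + 1))

# Hypothetical data: 8 successes in 10 trials, null hypothesis pi = 0.5.
p_value = binomial_upper_tail(8, 10, 0.5)
# (45 + 10 + 1) / 1024 = 0.0546875
```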

Chi-square test for independence
 a. Data are frequencies based on random sample.
 b. Samples are independent.
 c. value restriction
 2. null: There is no relationship, the variables are independent.
 alt.: There is a relationship, the two variables are dependent.
3. Expected
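
The card is cut off at this point. Assuming the third item refers to expected counts, the usual calculation under the null of independence is (row total * column total) / grand total for each cell. The sketch below uses a hypothetical 2x2 table and function name.

```python
def expected_counts(table):
    """Expected cell counts if the row and column variables are independent."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

# Hypothetical observed frequencies for two categorical variables.
observed = [[20, 30],
            [30, 20]]
expected = expected_counts(observed)
# every row and column total is 50, so each expected count is 50 * 50 / 100 = 25.0
```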

