An arrangement or sequence of events/objects/elements in which order matters. Ex. Timetable, combination lock
A grouping or set of events/objects/elements in which order does not matter. Ex. Forming a committee
The Fundamental Counting Principle
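Used to determine the number of possible outcomes when a series of choices is made: if one choice can be made in m ways and a second in n ways, the two together can be made in m × n ways.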
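A minimal Python sketch of the principle. The shirts and pants are hypothetical items invented for illustration. It checks that listing every outcome gives the same count as multiplying the number of options at each step.

```python
from itertools import product

# Hypothetical choices, invented for illustration.
shirts = ["red", "blue", "green"]
pants = ["jeans", "khakis"]

# Every (shirt, pants) pair is one possible outfit.
outfits = list(product(shirts, pants))

# The Fundamental Counting Principle predicts 3 * 2 outcomes.
predicted = len(shirts) * len(pants)
print(len(outfits), predicted)  # 6 6
```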
Principle of Inclusion and Exclusion
Used to determine the number of elements in the union of two or more sets, counting elements that belong to more than one set only once.
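For two sets the principle reads n(A or B) = n(A) + n(B) - n(A and B). A short Python sketch, using hypothetical club membership lists, compares the formula with a direct count of the union.

```python
# Hypothetical club membership lists, invented for illustration.
chess = {"Ana", "Ben", "Cy", "Dee"}
drama = {"Cy", "Dee", "Eli"}

# n(A or B) = n(A) + n(B) - n(A and B)
by_formula = len(chess) + len(drama) - len(chess & drama)

# Direct count of the union, for comparison.
by_union = len(chess | drama)
print(by_formula, by_union)  # 5 5
```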
Used when two or more items must be together.
When a set is too hard to count directly, count its complement and subtract that count from the size of the universal set.
For some problems, separate into cases and add cases together.
A collection of items called elements.
A set all of whose elements are contained in the original set.
An all-encompassing set that contains every element under consideration.
A set of all elements that are in the universal set but not in set A.
Disjoint or Mutually Exclusive
When two sets have no elements in common. (Every set and its complement are disjoint)
Includes all elements of both sets.
Only includes elements common to both sets.
The Empty/Null Set
A set containing no elements.
Cardinality of a Set
The number of elements in the set.
The product of a series of descending natural numbers, from n down to 1.
Used to determine the number of permutations of r items selected from n items.
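A sketch of the usual formula nPr = n! / (n - r)!, with hypothetical values n = 5 and r = 3. The formula is checked against Python's built-in `math.perm` (available from Python 3.8) and against a brute-force listing.

```python
from itertools import permutations
from math import factorial, perm

n, r = 5, 3  # hypothetical: arrange 3 of 5 distinct items

# Formula: nPr = n! / (n - r)!
by_formula = factorial(n) // factorial(n - r)

# math.perm computes the same quantity directly.
by_library = perm(n, r)

# Brute force: list every ordered selection and count them.
by_listing = len(list(permutations(range(n), r)))

print(by_formula, by_library, by_listing)  # 60 60 60
```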
Elements that are exactly the same. We do not count them as distinct elements.
Rule of Sum
(OR) Count using the Principle of Inclusion and Exclusion or add sets together if they are mutually exclusive.
Rule of Product
(AND) Count using the Fundamental Counting Principle - Multiply
A branch of mathematics that investigates through experiment, calculation, and reasoning the likelihood of specified events.
- A measure of the likelihood of an event or outcome.
- The chance that an event or outcome will occur.
A well defined process consisting of a number of trials in which clearly defined outcomes are observed.
Possible result of a single trial.
Set of all possible outcomes of an experiment. (S or U)
A one time through process of an experiment.
One or more outcomes (can be grouped).
An estimate based on experience or intuition.
Conduct an experiment with n trials in order to find the probability of an event A.
Probability is calculated, not experimentally determined. Assumes all outcomes are equally likely.
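A rough simulation comparing the two approaches, assuming a fair six-sided die and the event "roll a 6" (both chosen for illustration). The experimental value varies with the seed and should approach the theoretical one as the number of trials grows.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable
n_trials = 10_000

# Experimental probability: fraction of simulated rolls that show a 6.
rolls = [random.randint(1, 6) for _ in range(n_trials)]
experimental = rolls.count(6) / n_trials

# Theoretical probability: 1 favourable outcome out of 6 equally likely ones.
theoretical = 1 / 6

print(experimental, theoretical)
```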
Results are false or a certain result is exaggerated due to a small number of trials.
A comparison of the probability of an event occurring to the probability of the event not occurring.
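A small sketch of odds in favour, assuming the hypothetical event "roll a 6 on a fair die". Exact fractions are used so the ratio prints cleanly.

```python
from fractions import Fraction

# Hypothetical event: rolling a 6 on a fair six-sided die.
p_event = Fraction(1, 6)
p_not_event = 1 - p_event

# Odds in favour compare P(A) to P(not A).
odds_in_favour = p_event / p_not_event
print(f"{odds_in_favour.numerator}:{odds_in_favour.denominator}")  # 1:5
```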
Multiple events occurring.
When the outcome/occurrence of one event does not affect the outcome/occurrence of another.
When the outcome/occurrence of one event affects the outcome/occurrence of another.
Mutually Exclusive Events
Events that cannot occur at the same time (no intersection).
Non-Mutually Exclusive Events
Events that can occur at the same time (possible intersection).
A distribution of probabilities of all possible outcomes of an experiment.
Random Variable (X)
Variable that represents all possible outcomes of an experiment.
Individual value of X.
Values are separate and distinct. Finite number in an interval.
Values are all real numbers. Infinite number of values in an interval.
Uniform Probability Distribution
A distribution of probabilities with equally likely outcomes.
Non-Uniform Probability Distribution
Not all outcomes have the same probability.
The "average" outcome.
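Assuming this card describes the expected value, E(X) = sum of x · P(x), here is a minimal sketch for a fair six-sided die (an illustrative choice).

```python
from fractions import Fraction

# Probability distribution of a fair six-sided die (a uniform distribution).
distribution = {x: Fraction(1, 6) for x in range(1, 7)}

# E(X) = sum of x * P(x) over every value x.
expected_value = sum(x * p for x, p in distribution.items())
print(expected_value)  # 7/2
```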
The Binomial Distribution
Used for experiments involving repeated trials of independent events which can be classified as success or failure.
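A sketch of the standard binomial formula P(X = x) = C(n, x) · p^x · q^(n - x), where q = 1 - p. The function name and the coin example are illustrative assumptions.

```python
from fractions import Fraction
from math import comb

def binomial_probability(n, x, p):
    """P(X = x): exactly x successes in n independent trials."""
    q = 1 - p  # probability of failure on each trial
    return comb(n, x) * p**x * q**(n - x)

# Hypothetical example: exactly 2 heads in 5 tosses of a fair coin.
print(binomial_probability(5, 2, Fraction(1, 2)))  # 5/16
```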
Used to find the probability of x failures before the first success in an experiment. Requires independent events that can be classified as success or failure and repeated trials until the first success.
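Assuming this card describes the geometric distribution, the probability of x failures before the first success is P(X = x) = q^x · p. A minimal sketch with a hypothetical die example:

```python
from fractions import Fraction

def geometric_probability(x, p):
    """P(X = x): x failures followed by the first success."""
    q = 1 - p  # probability of failure on each trial
    return q**x * p

# Hypothetical: two non-sixes, then the first 6, on a fair die.
print(geometric_probability(2, Fraction(1, 6)))  # 25/216
```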
A probability distribution for experiments in which trials are not independent.
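Assuming this card describes the hypergeometric distribution (draws without replacement), a sketch of the usual formula P(X = x) = C(a, x) · C(n - a, r - x) / C(n, r). The parameter names and the card example are illustrative.

```python
from fractions import Fraction
from math import comb

def hypergeometric_probability(x, r, a, n):
    """P(X = x): x successes in r draws without replacement
    from n items, of which a count as successes."""
    return Fraction(comb(a, x) * comb(n - a, r - x), comb(n, r))

# Hypothetical: exactly 2 hearts in a 5-card hand from a 52-card deck.
p = hypergeometric_probability(x=2, r=5, a=13, n=52)
print(round(float(p), 4))  # 0.2743
```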
Measures of Central Tendency
Mean, Median, Mode
The average value.
The middle value.
The most frequent value.
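The three measures can be computed with Python's standard `statistics` module. The data set below is hypothetical.

```python
from statistics import mean, median, mode

data = [2, 3, 3, 5, 7]  # hypothetical data set

print(mean(data))    # sum of the values (20) divided by the count (5)
print(median(data))  # middle value of the sorted data
print(mode(data))    # the value that occurs most often
```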
The data has only one peak/mode.
Values are distributed symmetrically around the mean. The mean, median, and mode are the same.
Distribution Skewed Left
There are more values on the right side, with a tail extending to the left. Goes from left to right: mean, median, mode (Negatively Skewed).
Distribution Skewed Right
There are more data values on the left side, with a tail extending to the right. Goes from left to right: mode, median, mean (Positively Skewed).
A measure of the typical distance of a datum from the mean of a data set, calculated as the square root of the mean squared deviation.
Models continuous data that is distributed unimodally and symmetrically about the mean.
Standard Normal Distribution
A normal distribution with a mean of 0 and standard deviation of 1.
Altering an interval in order to include a certain value.
A measure of dispersion of the data in a data set.
The distance of an individual datum in a data set from the mean.
The number of standard deviations a value is away from the mean.
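A short sketch tying deviation, standard deviation, and z-score together: z = (x - mean) / standard deviation. The data set is hypothetical, and the population standard deviation is assumed.

```python
from statistics import mean, pstdev

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data set

mu = mean(data)       # mean of the data
sigma = pstdev(data)  # population standard deviation

# Deviation: signed distance of each datum from the mean.
deviations = [x - mu for x in data]

# z-score: deviation measured in standard deviations.
z_scores = [d / sigma for d in deviations]

print(sigma)         # 2.0
print(z_scores[-1])  # 2.0
```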
A branch of mathematics that deals with the gathering, organization, analysis, interpretation, and presentation of numerical information.
The original unprocessed information collected by the researcher.
When the variable takes on category types.
A graph that measures the frequencies of categorical data.
A graph that measures the frequencies of numerical data.
A polygon that connects the midpoints at the top of each bar of a histogram.
Cumulative Frequency Diagrams
Shows the total frequency of a value and all of the values below it.
Original data that is gathered by the researcher.
Found data that the researcher uses which was gathered by others.
An entire group of individuals being studied.
A subgroup of the population.
The group of individuals that actually have a chance of being chosen for the sample.
When every individual in the population has an equal chance of being chosen for the sample.
Simple Random Sample
Sample members are selected by a random process, such as a random number generator or simulation.
Systematic Random Sample
When the researcher goes through the population sequentially and selects members at regular intervals.
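A minimal sketch of a systematic random sample, assuming a hypothetical population of 100 ID numbers and a sample of 10. The interval is the population size divided by the sample size, and the starting point is chosen at random.

```python
import random

population = list(range(1, 101))  # hypothetical population of 100 ID numbers
sample_size = 10

# Sampling interval k: population size divided by sample size.
k = len(population) // sample_size

# Random starting point within the first interval, then every k-th member.
random.seed(0)  # fixed seed so the run is repeatable
start = random.randrange(k)
sample = population[start::k]
print(sample)
```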
The population is divided into strata, and the number of people sampled from each stratum is proportional to the number of people in that stratum in the population.
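A sketch of the proportional allocation, using a hypothetical school of 1000 students and a sample of 100. The numbers were chosen to divide evenly; in general the results need rounding.

```python
# Hypothetical school population, split into strata by grade.
strata_sizes = {"Grade 9": 300, "Grade 10": 250, "Grade 11": 250, "Grade 12": 200}
sample_size = 100

population_size = sum(strata_sizes.values())

# Each stratum contributes the same proportion to the sample
# as it does to the population.
allocation = {
    name: sample_size * size // population_size
    for name, size in strata_sizes.items()
}
print(allocation)
# {'Grade 9': 30, 'Grade 10': 25, 'Grade 11': 25, 'Grade 12': 20}
```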
When one or more groups are chosen for the sample that are likely to be a good representation of the population.
Multi Stage Sampling
Several random samples are taken to choose groups and subgroups of a population until arriving at the sample members.
Voluntary Response Sampling
When the researcher simply invites members of the population to participate in the study.
A sample is chosen that is easily accessible.
When a small sample is surveyed and sample members are asked to pass the survey along to their friends, who pass it along to their own friends, to get a larger sample.
When the researcher uses his or her judgement to choose members he or she believes will be appropriate for the study.
Ad Hoc Quotas
A quota is established and the researcher can choose anyone who fits the quota for the sample.
The tendency of a factor to favour certain outcomes or responses which systematically skews results.
The sampling frame is not an accurate representation of the entire population.
Certain groups are underrepresented because they choose not to participate in the study.
The data collection method systematically overestimates or underestimates a certain characteristic, which skews results.
2. Leading/Loaded questions.
Participants deliberately give false or misleading responses which skews results.
Voluntary Response Bias
Sample members are self-selected, which overrepresents those with strong opinions.
Data values that are distant from the majority of the data.
The difference between the highest data value and the lowest data value. Very sensitive to outliers.
Mean Absolute Value Deviation
Similar to standard deviation, measures the average distance of a datum from the mean, but uses absolute values instead of squares. However, MAD is used less often than standard deviation.
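A sketch comparing the two measures on one hypothetical data set. The mean absolute deviation averages the absolute distances from the mean, while the (population) standard deviation is the square root of the mean squared distance.

```python
from statistics import mean, pstdev

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data set
mu = mean(data)

# Mean absolute deviation: average of the absolute distances from the mean.
mad = sum(abs(x - mu) for x in data) / len(data)

# Population standard deviation of the same data, for comparison.
sigma = pstdev(data)

print(mad, sigma)  # 1.5 2.0
```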
Values that divide the data into four sections that each have an equal number of data values.
The range of the middle half of the data. Includes 50% of the data around the median.
Half of the interquartile range. A measure of spread about the median.
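A sketch of quartiles, interquartile range, and semi-interquartile range using `statistics.quantiles`. The data set is hypothetical, and quartile conventions differ between textbooks and software, so other data sets may give slightly different values than a hand method.

```python
from statistics import quantiles

data = [1, 2, 3, 4, 5, 6, 7]  # hypothetical data set

# quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
q1, q2, q3 = quantiles(data, n=4)

iqr = q3 - q1       # interquartile range
semi_iqr = iqr / 2  # semi-interquartile range

print(q1, q2, q3)     # 2.0 4.0 6.0
print(iqr, semi_iqr)  # 4.0 2.0
```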
Box and Whisker Plot
A visual representation of a data set to illustrate quartiles.
Modified Box and Whisker Plot
A box and whisker plot in which outliers are plotted as separate points beyond the whiskers.
Similar to quartiles but the data is divided into 100 intervals with an equal number of data values in each interval.
Two Variable Statistics
Provides ways to detect relationships between variables and develop mathematical models for them.
Addressing the relationship between variables.
When the changes in x are proportional to the changes in y.
Positive Linear Correlation
As x increases, y increases.
Negative Linear Correlation
As x increases, y decreases.
Line of Best Fit
Line on a scatterplot that shows the pattern and direction of the points. Line that is the closest to the plotted points. Passes through as many points as possible with the remaining points grouped evenly above and below the line. Can be used to make predictions about values that are not given or recorded.
Predicting a value that is within the range of the plotted points.
Predicting a value beyond the range of the plotted points.
When the plotted points lie very close to the lbf.
When the points are more dispersed but still form a rough line.
When the points lie exactly on the lbf.
A measure of the strength of a correlation between two variables. Its value depends on the units of the variables.
A quantitative measure of the strength of a linear correlation. How closely the points cluster around the lbf.
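Assuming this card refers to the Pearson correlation coefficient r, here is a sketch of the calculation from deviations about the means. The paired data is hypothetical.

```python
from math import sqrt

# Hypothetical paired data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# Sums of products of deviations from the means.
s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
s_xx = sum((a - mean_x) ** 2 for a in x)
s_yy = sum((b - mean_y) ** 2 for b in y)

# Correlation coefficient: always between -1 and 1, and unit-free.
r = s_xy / sqrt(s_xx * s_yy)
print(round(r, 4))  # 0.7746
```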
An analytical technique used to determine the relationship/model/equation between two variables.
Find the linear correlation
Refers to finding the equation of the lbf.
Perform a linear regression
Refers to the process of finding the equation of the lbf.
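A sketch of finding the equation of the lbf, assuming the least-squares method (the card does not say which method is intended). The data is hypothetical.

```python
# Hypothetical paired data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# Least-squares slope: sum of products of deviations over sum of squared x deviations.
s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
s_xx = sum((a - mean_x) ** 2 for a in x)
slope = s_xy / s_xx

# The least-squares line passes through the point (mean_x, mean_y).
intercept = mean_y - slope * mean_x

print(round(slope, 2), round(intercept, 2))  # 0.6 2.2
```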
The positive or negative vertical distance between a data value and the lbf.
An analytical technique for finding the equation of the curve of best fit.
Cause and effect relationship
A change in x produces a change in y.
Common cause factor
An external variable changes both variables in the same way.
Reverse cause and effect relationship
The independent and dependent variables are reversed in the process of establishing causality.
A correlation exists with no causal relationship. Variables are completely unrelated to each other.
A correlation exists with no causal relationship. Variables are related to each other, but it is difficult to find a common cause factor or cause and effect relationship.
External variables that can influence results and the relationship. May influence independent or dependent variable.
Double Blind Study
Neither the participants nor the researchers know who is in which group.
Coefficient of Determination
A general measure of how well a specific regression curve fits the data.
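One common way to compute it is 1 minus the ratio of unexplained variation to total variation. The sketch below assumes that formula, hypothetical data, and a hypothetical fitted line y = 0.6x + 2.2.

```python
# Hypothetical paired data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

def predict(value):
    """Hypothetical fitted model (the least-squares line for this data)."""
    return 0.6 * value + 2.2

mean_y = sum(y) / len(y)

# Variation left unexplained by the model (sum of squared residuals).
ss_res = sum((b - predict(a)) ** 2 for a, b in zip(x, y))

# Total variation of y about its mean.
ss_tot = sum((b - mean_y) ** 2 for b in y)

r_squared = 1 - ss_res / ss_tot
print(round(r_squared, 2))  # 0.6
```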