The flashcards below were created by user
on FreezingBlue Flashcards.
Involve numeric characteristics such as age, height, sales revenue, or business profits
Involve non-numeric characteristics such as race, gender, or hair color
Characteristics of a population
Statistics that determine something about an entire group (population) based on looking at part of the group (sample)
Statistics that organize, summarize and present data
A change in one variable causes a change in another variable
Variables appear to have a relationship to each other, but one variable does not necessarily cause a change in the other
The science of gathering, organizing, analyzing, and presenting data
A portion or a subset from a population
The entire set of items, people or measurements that are studied
When the occurrence of one event prevents the other events from happening
All observations will be placed in a category. There are no other options
What are the two types of quantitative variables?
Discrete and continuous
Can assume only specific, separate values. Ex: population of a city. Discrete variables are counted.
A variable that can take any value within a certain range, such as weight or interest rates
Levels of measurement
Used to classify data. Also called scales of measurement. Indicates how data is calculated, summarized, measured, and tested
What are the 4 levels of measurement?
Nominal, ordinal, interval, and ratio
What 2 levels of measurement are used in qualitative variables?
Nominal and ordinal
What 2 levels of measurement are used in quantitative variables?
Interval and ratio
Nominal level data
Mutually exclusive and exhaustive, with no logical sequence. Classified and counted. Ex: number of boys and girls in a class
Ordinal level data
Mutually exclusive and exhaustive, and can be ranked or ordered. Ex: grades A, B, C, D, F
Interval level data
Mutually exclusive and exhaustive. Can be ranked or ordered. The difference between classifications is a consistent unit of measure. Zero does not mean nothing is present. Ex: dress size and temperature
In a population, it is the sum of all the values divided by the number of items. In a sample, it is the sum of the values of the items selected divided by the number of items selected. Ratio level data uses the arithmetic mean to represent the center
Characteristics of an Arithmetic Mean
A mean uses all values in a sample or a population
A mean might be distorted by large or small values called outliers
All interval and ratio data have a mean
Each set of data has a single unique mean
If you sum each item's deviation from the mean, it equals zero
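The characteristics above can be checked on a small data set. This is a minimal sketch using Python's standard `statistics` module; the values are hypothetical and chosen only so the arithmetic is easy to follow.

```python
from statistics import mean

# Hypothetical sample values, chosen only for illustration.
values = [4, 8, 6, 5, 7]

# Arithmetic mean: sum of the values divided by the number of items.
center = mean(values)

# Each item's deviation from the mean; these deviations sum to zero.
deviations = [x - center for x in values]
total_deviation = sum(deviations)

# One extreme value (an outlier) pulls the mean away from the typical values.
with_outlier = values + [60]
center_with_outlier = mean(with_outlier)

print("mean:", center)
print("sum of deviations:", total_deviation)
print("mean with outlier:", center_with_outlier)
```

With these numbers the mean is 6, the deviations sum to 0, and adding the single value 60 moves the mean to 15.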
Ratio level data
Mutually exclusive and exhaustive. Can be ranked or ordered. Consistent unit of measurement. Zero means none. The ratio between 2 classifications is meaningful. Ex: gross pay, hours worked, test scores
A computation of the arithmetic mean used when you have multiple observations of the same value in the population or sample.
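The weighted computation described above can be sketched as follows. The helper name `weighted_mean` and the price and count data are assumptions made for illustration, not part of the original cards.

```python
# Hypothetical data: each distinct value paired with how many times it occurs.
prices = [1.00, 1.50, 2.00]
counts = [3, 5, 2]


def weighted_mean(values, weights):
    """Return sum(value * weight) divided by sum(weights)."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


result = weighted_mean(prices, counts)

# The same answer comes from writing out every observation and
# taking the ordinary arithmetic mean.
expanded = [p for p, c in zip(prices, counts) for _ in range(c)]
plain_mean = sum(expanded) / len(expanded)

print(result, plain_mean)
```

Weighting by the counts avoids listing every repeated observation, which is the only difference from the ordinary mean.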
The midpoint of the values after they have been arranged in order from smallest to largest or largest to smallest. Ordinal data uses the median to represent the center
A special form of mean used in a situation in which you are computing averages that compound on each other or you want to compute the rate of change on an item over time.
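Both uses named above (averages that compound, and a rate of change over time) can be sketched briefly. This assumes Python 3.8 or later for `statistics.geometric_mean`; the growth factors and start and end values are hypothetical.

```python
from statistics import geometric_mean

# Hypothetical yearly growth factors: +10%, +20%, -5%.
growth_factors = [1.10, 1.20, 0.95]

# Average compound growth factor: the n-th root of the product of n factors.
avg_factor = geometric_mean(growth_factors)
avg_rate = avg_factor - 1

# Rate of change over time from a start and end value (hypothetical numbers):
# the per-period rate that turns start_value into end_value after `periods`.
start_value, end_value, periods = 100.0, 200.0, 5
rate = (end_value / start_value) ** (1 / periods) - 1

print(f"average compound rate: {avg_rate:.4f}")
print(f"per-period rate of change: {rate:.4f}")
```

Applying `avg_factor` three times in a row reproduces the combined effect of the three original factors, which an arithmetic mean of the factors would not do.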
The value of an observation that occurs most frequently. Nominal data uses the mode to represent the center.
The population mean is the mean of all the values in a population. The formula is expressed as follows:
$$\mu = \frac{\sum X}{N}$$
- μ = Population mean
- Σ = Sum
- X = Population value
- N = Number of values in the population
Characteristics of the Median
It is not affected by extremely high or extremely low scores
Each set of data has a single median
It can be computed on ratio-level, interval-level, and ordinal-level data
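A short sketch of the median's behavior, using `statistics.median` from the Python standard library on hypothetical scores. It shows the ordering step, the even-count case (where the two middle values are averaged), and the resistance to an extreme score.

```python
from statistics import median

# Hypothetical scores; median() orders the values internally.
scores = [3, 1, 5, 2, 6]
middle = median(scores)  # ordered: 1, 2, 3, 5, 6

# Replacing the largest score with an extreme one leaves the median unchanged.
scores_with_outlier = [3, 1, 5, 2, 100]
middle_with_outlier = median(scores_with_outlier)

# With an even number of values, the two middle values are averaged.
even_count = [4, 1, 3, 2]
middle_even = median(even_count)  # ordered: 1, 2, 3, 4

print(middle, middle_with_outlier, middle_even)
```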
Characteristics of a mode
It is used with all types of data
It is not affected by extremely high or low values
If there are no recurring values, there is no mode
It is possible to have multiple modes
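The characteristics above can be sketched with a small helper. The function name `modes` is an assumption for illustration. It follows the convention on these cards that data with no repeated value has no mode, which differs from `statistics.multimode`, which returns every value in that case.

```python
from collections import Counter


def modes(values):
    """Return the most frequent value(s), or [] if nothing repeats."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:
        # No recurring values: treated here as "no mode".
        return []
    return [value for value, count in counts.items() if count == top]


# Works on nominal (non-numeric) data as well as numbers.
hair_colors = ["red", "blue", "red", "green"]
two_modes = [1, 2, 2, 3, 3]
no_repeats = [1, 2, 3]

print(modes(hair_colors), modes(two_modes), modes(no_repeats))
```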
In a symmetrical distribution, the arithmetic mean, median, and mode are equal. You can use any one of these measures to represent the center.
A symmetrical distribution is one where the histogram has the same shape on each side of the center point. In other words, if you cut the histogram in half at the center point, you get two identical pieces. Not all distributions are symmetrical.
Positively Skewed Distribution
The arithmetic mean is larger than the median or the mode due to one or more large values
Negatively skewed distributions
The arithmetic mean is smaller than the median or the mode due to one or more small values
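The relationship between the mean and the median described above can be turned into a rough check. This is only a heuristic sketch, not a formal skewness statistic; the helper name `skew_direction` and the data sets are hypothetical.

```python
from math import isclose
from statistics import mean, median


def skew_direction(values):
    """Compare mean and median as a rough indicator of skew direction."""
    m, med = mean(values), median(values)
    if isclose(m, med):
        return "no skew indicated"
    return "positive" if m > med else "negative"


symmetric = [1, 2, 3, 4, 5]        # mean 3, median 3
large_value = [1, 2, 3, 4, 50]     # one large value pulls the mean up to 12
small_value = [-40, 2, 3, 4, 5]    # one small value pulls the mean down to -5.2

for data in (symmetric, large_value, small_value):
    print(data, "->", skew_direction(data))
```

In all three data sets the median is 3; only the mean moves, and it moves toward the extreme value.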
Measures of dispersion
Measures of dispersion tell you about the spread in the data. There are 4 measures of dispersion: range, mean deviation, variance and standard deviation.
The difference between the highest and lowest values in the data
The arithmetic mean of the absolute values of the deviations of each observation from the arithmetic mean
The arithmetic mean of the squared deviations of the observations from the mean
The square root of the variance
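The four measures defined above can be computed side by side. This sketch uses hypothetical values and the population forms (`pvariance`, `pstdev`) from Python's `statistics` module, since the definition above divides by the number of observations; sample variance is commonly defined with n - 1 in the denominator instead.

```python
from statistics import mean, pstdev, pvariance

# Hypothetical observations, chosen so the results are round numbers.
values = [2, 4, 4, 4, 5, 5, 7, 9]
center = mean(values)

# Range: highest value minus lowest value.
value_range = max(values) - min(values)

# Mean deviation: mean of the absolute deviations from the mean.
mean_deviation = mean([abs(x - center) for x in values])

# Variance: mean of the squared deviations from the mean (population form).
variance = pvariance(values)

# Standard deviation: square root of the variance.
std_dev = pstdev(values)

print(value_range, mean_deviation, variance, std_dev)
```

For these values the range is 7, the mean deviation is 1.5, the variance is 4, and the standard deviation is 2.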
P.L. Chebyshev was a mathematician who developed a theorem regarding standard deviation. It states that for any population or sample, the proportion of values that lie within plus or minus k standard deviations of the mean is at least:
$$1 - \frac{1}{k^2}$$
- k = Number of standard deviations
- Note: For this theorem to work, k must be greater than one.
For example, with k = 2 the theorem gives 1 - 1/2² = 3/4, so at least 75% of the data falls within two standard deviations of the mean.
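Chebyshev's bound, 1 - 1/k², can be evaluated directly and compared with a data set. This is a minimal sketch; the function names and the data are hypothetical, and the standard deviation is computed in its population form.

```python
from statistics import mean, pstdev


def chebyshev_lower_bound(k):
    """Minimum share of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("k must be greater than one")
    return 1 - 1 / k**2


def share_within(values, k):
    """Observed share of values within k standard deviations of the mean."""
    center, spread = mean(values), pstdev(values)
    inside = [x for x in values if abs(x - center) <= k * spread]
    return len(inside) / len(values)


# Hypothetical observations with mean 5 and standard deviation 2.
values = [2, 4, 4, 4, 5, 5, 7, 9]

for k in (1.5, 2, 3):
    print(k, chebyshev_lower_bound(k), share_within(values, k))
```

The observed share can be well above the bound. The theorem only guarantees a minimum, and it holds for a distribution of any shape.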
The empirical rule (also called the normal rule) applies only to a symmetrical, bell-shaped distribution.
Approximately 68% of the data is within plus or minus one standard deviation of the mean.
Approximately 95% of the data is within plus or minus two standard deviations of the mean.
Approximately 99.7% of the data is within plus or minus three standard deviations of the mean.
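The three percentages above can be recovered from the normal distribution itself. This sketch assumes Python 3.8 or later for `statistics.NormalDist`; the helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Share of a normal distribution within plus or minus k standard deviations."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


shares = {k: share_within(k) for k in (1, 2, 3)}

for k, share in shares.items():
    print(f"within {k} standard deviation(s): {share:.1%}")
```

Because the shares depend only on k, the same percentages hold for any normal distribution, whatever its mean and standard deviation.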