Card Set Information
Business Stats vocab
science that deals with the analysis and classification of empirical data. It also attempts to draw conclusions based on
past or present experience.
totality of all data studied.
subset of data drawn from a population.
A numerical measurement referring to a population.
A numerical measurement referring to a sample.
The collection of data from every member of the population.
assumes different values. We use letters to indicate variables.
has a fixed value. It is the opposite of a variable.
variable that assumes values depending on chance.
It assumes a finite number of values, or if it
assumes infinitely many values, the values can be counted using the counting
numbers 1, 2, 3 … etc.
It assumes infinitely many values that cannot be
counted. There are no gaps between the values.
or levels of measurement: Nominal
Non-numerical data such as names, labels, categories,
etc. They cannot be ordered.
or levels of measurement: Interval
Like ordinal, but differences are meaningful. There is no natural starting point, i.e.
there is no true zero, so ratios are
meaningless. For example body
or levels of measurement: Ordinal
The data can be ordered, but differences either
cannot be determined or are meaningless, e.g. rating movies using stars.
or levels of measurement: Ratio
is the highest level of measurement for numerical data. There is a zero
starting point and differences and ratios are meaningful.
data, e.g. color, party or religious affiliation, etc.
data use either the nominal or ordinal scale of measurement.
Quantitative or Numerical:
data, e.g. test scores, income figures, etc.
data use either the interval or ratio scale of measurement.
: A variable with
Statistical analysis for categorical variables is
limited to summarizing the data by category or computing the proportion of the
observations in each category.
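Computing the proportion of observations in each category can be sketched as below. This is a minimal illustration, not a prescribed method: the party-affiliation data and the function name `category_proportions` are invented for the example.

```python
from collections import Counter

# Hypothetical categorical observations, invented for illustration.
affiliations = ["Democrat", "Republican", "Independent",
                "Democrat", "Republican", "Democrat"]

def category_proportions(values):
    """Return the share of observations that fall in each category."""
    counts = Counter(values)
    total = len(values)
    return {category: count / total for category, count in counts.items()}

proportions = category_proportions(affiliations)
for category, share in sorted(proportions.items()):
    print(f"{category}: {share:.2f}")
```

The shares always add up to 1, which is a quick check that no observation was dropped.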
: A variable with
The data can be manipulated mathematically and the
results are meaningful. For example, we can add the data and divide by the
number of observations to arrive at the average value.
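The average described above (add the data, divide by the number of observations) can be sketched as follows. The test scores are hypothetical values chosen for the example.

```python
# Hypothetical test scores, invented for illustration.
test_scores = [72, 85, 90, 66, 87]

def mean(values):
    """Add the data and divide by the number of observations."""
    return sum(values) / len(values)

average_score = mean(test_scores)
print(average_score)  # 400 / 5 = 80.0
```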
Data collected at one point in time.
For example, a media research company calls up 5,000
households at random to determine the proportion of households tuned to NBC to
watch the opening ceremony of the 2012 Olympic Games.
Types of data: Time series
Data collected at regular intervals over time.
Typical measuring intervals are months (for example, monthly unemployment figures for the last three
years) or quarters (for example, company
quarterly reports for the last two years).
The best way to represent time series data is by a
For statistical studies first we must identify what we want to study. This is referred to as...
The variable of interest
We observe and measure specific characteristics, but we do not attempt to control or modify the subjects being studied. A Gallup poll is an example of an observational study.
We conduct an experiment or, as we say in Statistics,
we apply some treatment and then observe its effects on the subjects. (The
subjects are usually called experimental units.)
Pharmaceutical companies conduct such experiments
when they test new drugs.
One of the two parts of statistics: Descriptive Statistics
This part of Statistics attempts to summarize or describe the important characteristics of a set of data.
Methods of summarizing data include tables; pictures such as bar charts, pie charts, histograms, frequency polygons,
and line charts; and numbers that
measure a specific characteristic of the data. For example, the mean or average
measures the center of a set of data.
One of the two parts of statistics: Inferential Statistics or Statistical Inference
This part of Statistics attempts to make inferences or draw conclusions or generalizations
about a large population, based on a sample drawn from that population.
The tools used are based on Probability and Probability
Distributions and are extremely sophisticated.
The methods used in Statistical Inference have a solid
mathematical foundation and will yield valid results, provided of course
that the sample is representative of the population.
So the weak link in Statistical Inference is the sample and the sample size. Obviously a biased sample will yield unreliable results.
How to choose a “good” sample is a science in itself.
Methods of sampling: Random sampling
Each member from the population has an equal chance of being selected.
A sample of size n is selected in such a way that every
possible sample of the same size n has an equal chance of being selected.
Note that there is a difference between a random sample and a simple random sample.
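The difference between a random sample and a simple random sample can be sketched as below, assuming the usual definition that in a simple random sample every possible sample of size n is equally likely. The population of ten labeled members and both function names are hypothetical, chosen only for illustration.

```python
import random

# Hypothetical population of 10 labeled members.
members = list(range(1, 11))

def simple_random_sample(population, n, rng):
    """Every possible subset of size n is equally likely to be drawn."""
    return rng.sample(population, n)

def coin_flip_half(population, rng):
    """Every member has a 1/2 chance of selection, but only two samples
    are possible, so this is a random sample that is not a simple
    random sample."""
    half = len(population) // 2
    return population[:half] if rng.random() < 0.5 else population[half:]

rng = random.Random(0)  # fixed seed so the run is repeatable
srs = simple_random_sample(members, 5, rng)
halves = coin_flip_half(members, rng)
print(srs, halves)
```

In the second scheme members 1 and 6 can never appear in the same sample, which is exactly what the simple random sample rules out.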
Divide the population into sub-populations or strata, and then draw a sample from each stratum.
: If the sample
selected from each stratum is a random sample, then this procedure (first
stratification, then random sampling) is called stratified random sampling. This is a subgroup of stratified sampling.
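Stratification followed by random sampling within each stratum can be sketched as below. The student population, the stratum labels, and the function name `stratified_random_sample` are assumptions made for the example, as is the choice to draw the same number of members from every stratum.

```python
import random

# Hypothetical population of (member id, stratum label) pairs.
students = [(i, "grad" if i % 3 == 0 else "undergrad") for i in range(1, 31)]

def stratified_random_sample(population, per_stratum, rng):
    """Group members by stratum, then draw a random sample from each group."""
    strata = {}
    for member_id, stratum in population:
        strata.setdefault(stratum, []).append(member_id)
    return {stratum: rng.sample(ids, per_stratum)
            for stratum, ids in strata.items()}

rng = random.Random(1)  # fixed seed so the run is repeatable
stratified = stratified_random_sample(students, 3, rng)
print(stratified)
```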
a starting point then select a specified element, say the kth
Divide the population into sections or clusters, choose a few clusters at random, and
then perform a census within each
selected cluster. This means selecting all
the elements from the chosen clusters.
A special case of cluster sampling is area
sampling, where the clusters are geographic subdivisions.
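Cluster sampling (choose a few clusters at random, then take every element in them) can be sketched as below. The city blocks and household labels are hypothetical, and `cluster_sample` is a name invented for this illustration.

```python
import random

# Hypothetical clusters: city blocks mapped to the households they contain.
city_blocks = {
    "block_A": ["A1", "A2", "A3"],
    "block_B": ["B1", "B2"],
    "block_C": ["C1", "C2", "C3", "C4"],
    "block_D": ["D1", "D2"],
}

def cluster_sample(clusters, n_clusters, rng):
    """Pick n_clusters clusters at random, then keep every element in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}

rng = random.Random(2)  # fixed seed so the run is repeatable
surveyed = cluster_sample(city_blocks, 2, rng)
print(surveyed)
```

Because the clusters here are geographic subdivisions, this sketch is also an instance of area sampling.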
Choose data that are readily and conveniently available. This does not yield
statistically valid results.
A voluntary response sample is one in which the
respondents themselves decide whether to be included or not.
Such a sample is flawed and should not be used for making general statements about a population.
Sampling schemes that combine several sampling methods are called multistage samples.