The flashcards below were created by user larry.gish89 on FreezingBlue Flashcards.

Statistics:
 The science that deals with the analysis and classification of empirical data. It also attempts to draw conclusions based on past or present experience.

Population
 The totality of all data studied.

Sample
 A subset of data drawn from a population.

Parameter
 A numerical measurement referring to a population.

Statistic:
A numerical measurement referring to a sample.
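The distinction between a parameter and a statistic can be illustrated with a small sketch. The population of exam scores below is made up for illustration, and the names `population`, `population_mean`, and `sample_mean` are hypothetical.

```python
import random
import statistics

# Hypothetical population: exam scores for 1,000 students (made-up data).
population = [random.randint(40, 100) for _ in range(1000)]

# Parameter: a numerical measurement computed from the whole population.
population_mean = statistics.mean(population)

# Statistic: the same measurement computed from a sample of that population.
sample = random.sample(population, 50)
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The two numbers are usually close but not equal, because the statistic depends on which members happened to be drawn.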

Census:
The collection of data from every member of the population.

Variable
 It assumes different values. We use letters to indicate variables.

Constant
 It has a fixed value. It is the opposite of a variable.

Random Variable
 A variable that assumes values depending on chance.

Discrete Variable
 It assumes a finite number of values or, if it assumes infinitely many values, the values can be counted using the counting numbers 1, 2, 3, etc.

Continuous Variable
 It assumes infinitely many values that cannot be counted. There are no gaps between the values.

Scales or levels of measurement: Nominal
 Nonnumerical data such as names, labels, categories, etc. They cannot be ordered.

Scales or levels of measurement: Interval
 Like ordinal, but differences are meaningful. There is no natural zero starting point, so ratios are meaningless. For example, body temperatures.
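A short arithmetic sketch of why ratios are meaningless on an interval scale: the same pair of temperatures gives a different ratio in Celsius than in Fahrenheit, while comparisons of differences survive the change of unit. The temperature values and the helper `c_to_f` are chosen here purely for illustration.

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

low_c, mid_c, high_c = 10, 20, 30
low_f, mid_f, high_f = c_to_f(low_c), c_to_f(mid_c), c_to_f(high_c)

# Ratios depend on the unit, so "twice as hot" has no meaning.
ratio_c = mid_c / low_c   # 2.0 in Celsius
ratio_f = mid_f / low_f   # 68 / 50 = 1.36 in Fahrenheit

# Differences compare consistently in both units: the two gaps are equal.
equal_gaps_c = (high_c - mid_c) == (mid_c - low_c)
equal_gaps_f = (high_f - mid_f) == (mid_f - low_f)

print(ratio_c, ratio_f, equal_gaps_c, equal_gaps_f)
```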

Scales or levels of measurement: Ordinal
 The data can be ordered, but differences either cannot be determined or are meaningless, e.g. rating movies using stars.

Scales or levels of measurement: Ratio
 This is the highest level of measurement for numerical data. There is a zero starting point, and differences and ratios are meaningful.

Types of data: Categorical or Qualitative:
 Nonnumerical data, e.g. color, party or religious affiliation, etc.
 Categorical data use either the nominal or ordinal scale of measurement.

Types of data: Quantitative or Numerical:
 Numerical data, e.g. test scores, income figures, etc.
 Quantitative data use either the interval or ratio scale of measurement.

Types of data: Categorical variable
 A variable with categorical data.
 Statistical analysis for categorical variables is limited to summarizing the data by category or computing the proportion of the observations in each category.
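As a minimal sketch of that kind of summary, the snippet below counts observations per category and computes each category's proportion. The party-affiliation data are made up for illustration.

```python
from collections import Counter

# Hypothetical categorical observations (made-up data).
observations = ["Dem", "Rep", "Ind", "Dem", "Rep", "Dem", "Ind", "Dem"]

# Summarize by category: count how often each category occurs.
counts = Counter(observations)

# Proportion of the observations that fall in each category.
total = len(observations)
proportions = {category: count / total for category, count in counts.items()}

for category, share in proportions.items():
    print(f"{category}: count={counts[category]}, proportion={share:.3f}")
```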

Types of data: Quantitative variable
 A variable with numerical data.
 The data can be manipulated mathematically and the results are meaningful. For example, we can add the data and divide by the number of observations to arrive at the average value.
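The averaging step described above can be written out directly. The test scores here are hypothetical.

```python
# Hypothetical quantitative observations: five test scores (made-up data).
scores = [82, 75, 91, 68, 84]

# Add the data and divide by the number of observations.
total = sum(scores)             # 400
average = total / len(scores)   # 400 / 5 = 80.0

print(average)
```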

Types of data: Cross-sectional data
 Data collected at one point in time.
 For example, a media research company calls up 5,000 households at random to determine the proportion of households tuned to NBC to watch the opening ceremony of the 2012 Olympic Games.

Types of data: Time series data
 Data collected at regular intervals over time.
 Typical measuring points are months (for example, monthly unemployment figures for the last three years) and quarters (for example, company quarterly reports for the last two years).
 The best way to represent time series data is by a line graph.

For statistical studies first we must identify what we want to study. This is referred to as...
The variable of interest

Observational statistics:
We observe and measure specific characteristics, but we do not attempt to control or modify the subjects being studied. A Gallup poll is an example of an observational study.


Experimental statistics:
 We conduct an experiment or, as we say in Statistics, we apply some treatment and then we observe its effects on the subjects. (The subjects are usually called experimental units.)
 Pharmaceutical companies conduct such experiments when they test new drugs.

One of the two parts of statistics: Descriptive Statistics
This part of Statistics attempts to summarize or describe the important characteristics of a set of data.
 Methods of summarizing data include tables, pictures such as bar charts, pie charts, histograms, frequency polygons, line charts, etc., and numbers that measure a specific characteristic of the data. For example, the mean or average measures the center of a set of data.

One of the two parts of statistics: Inferential Statistics or Statistical Inference
 This part of Statistics attempts to make inferences or draw conclusions or generalizations about a large population, based on a sample drawn from that population.
 The tools used are based on Probability and Probability Distributions and are extremely sophisticated.
 The methods used in Statistical Inference have a solid mathematical foundation, and they will yield valid results provided, of course, that the sample is representative of the population.
 So the weak link in Statistical Inference is the sample and the sample size. Obviously a biased sample will yield unreliable results.
 How to choose a “good” sample is a science in itself.


Methods of sampling: Random sampling
 Each member of the population has an equal chance of being selected.

Methods of sampling: Simple random sample of size n
 Every possible sample of the same size n has an equal chance of being selected.
 Notice that there is a difference between a random sample and a simple random sample.
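A minimal sketch of drawing a simple random sample, assuming Python's standard `random.sample`, which draws n distinct members without replacement. The population of 20 numbered members is hypothetical.

```python
import math
import random

# Hypothetical population: 20 members labeled 1 through 20.
population = list(range(1, 21))
n = 5

# Draw n distinct members without replacement.
sample = random.sample(population, n)

# Number of distinct possible samples of size n: "20 choose 5".
num_possible_samples = math.comb(len(population), n)

print(sample)
print(num_possible_samples)  # 15504
```

Each of the 15,504 possible five-member subsets is equally likely to be the one drawn, which is the defining property of a simple random sample.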

Methods of sampling: Stratified
 Divide the population into subpopulations or strata, and then draw a sample from each stratum.
 Note: If the sample selected from each stratum is a random sample, then this procedure, first stratification and then random sampling, is called stratified random sampling. This is a subgroup of stratified sampling.
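Stratified random sampling might be sketched as follows. Drawing the same fraction from every stratum (proportional allocation) is an assumption made here for illustration; the card does not say how many members to take from each stratum. The function `stratified_random_sample` and the student data are hypothetical.

```python
import random

# Hypothetical population: (student_id, class_year) pairs.
population = [(i, "freshman") for i in range(60)] + \
             [(i, "senior") for i in range(60, 100)]

def stratified_random_sample(units, stratum_of, fraction):
    """Group units into strata, then draw a random sample from each stratum."""
    strata = {}
    for unit in units:
        strata.setdefault(stratum_of(unit), []).append(unit)

    sample = []
    for members in strata.values():
        # Assumption: take the same fraction of every stratum, at least one unit.
        k = max(1, round(len(members) * fraction))
        sample.extend(random.sample(members, k))
    return sample

sample = stratified_random_sample(population, lambda unit: unit[1], 0.10)
print(sample)
```

With a 10% fraction, the sketch draws 6 of the 60 freshmen and 4 of the 40 seniors, so both strata are guaranteed to be represented.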

Methods of sampling: Systematic
 Choose a starting point and then select every kth element.
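A minimal sketch of systematic sampling, reading the definition in the usual way as "take every kth element after the starting point". The function `systematic_sample` is hypothetical, and choosing the start at random among the first k positions is an assumption, not something the card specifies.

```python
import random

def systematic_sample(units, k, start=None):
    """Select every kth unit, beginning at index `start`."""
    if start is None:
        # Assumption: pick the starting index at random from the first k positions.
        start = random.randrange(k)
    return units[start::k]

# Hypothetical population: 100 members labeled 1 through 100.
population = list(range(1, 101))

sample = systematic_sample(population, k=10, start=3)
print(sample)  # [4, 14, 24, 34, 44, 54, 64, 74, 84, 94]
```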

Methods of sampling: Cluster
 Divide the population into sections or clusters, choose a few clusters at random, and then perform a census within each selected cluster. This means select all the elements from the chosen clusters.
 A special case of cluster sampling is area sampling, where the clusters are geographic subdivisions.
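The cluster procedure can be sketched as below. The city blocks and households are made-up data, and `cluster_sample` is a hypothetical helper.

```python
import random

# Hypothetical clusters: city blocks mapped to the households in each block.
clusters = {
    "block_A": ["A1", "A2", "A3"],
    "block_B": ["B1", "B2"],
    "block_C": ["C1", "C2", "C3", "C4"],
    "block_D": ["D1", "D2"],
}

def cluster_sample(clusters, num_clusters):
    """Choose clusters at random, then take every element of each chosen cluster."""
    chosen = random.sample(sorted(clusters), num_clusters)
    # Census within each selected cluster: keep all of its elements.
    sample = [unit for name in chosen for unit in clusters[name]]
    return chosen, sample

chosen, sample = cluster_sample(clusters, 2)
print(chosen)
print(sample)
```

Because the clusters here are geographic blocks, this particular sketch is also an instance of area sampling.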

Methods of sampling: Convenience
 Just choose data that are readily and conveniently available. This does not yield statistically valid results.

Methods of sampling: Voluntary Response
 A voluntary response sample is one in which the respondents themselves decide whether to be included or not.
 Such a sample is flawed and should not be used for making general statements about a population.

Methods of sampling: Multistage Sampling
 Sampling schemes that combine several sampling methods are called multistage samples.

