Categorical Data
Use labels or names to identify categories of like items

Quantitative data
Are numerical values that indicate how much or how many

Data visualization
Term used to describe the use of graphical displays to summarize and present information about a data set

Frequency Distribution
A tabular summary of data showing the number (frequency) of observations in each of several nonoverlapping categories or classes

Relative & percent frequency
def. & equation
 Relative frequency: the fraction or proportion of observations belonging to a class.
 Relative frequency = frequency of the class / n
 n = total number of observations
 Percent frequency = relative frequency * 100
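
A minimal sketch of the two formulas above, assuming a small made-up list of categorical observations (the labels "A", "B", "C" and the helper name relative_frequencies are illustrative, not from the notes):

```python
from collections import Counter

# Hypothetical sample of categorical observations (illustrative only).
observations = ["A", "B", "A", "C", "A", "B", "A", "C", "B", "A"]

def relative_frequencies(data):
    """Return {class: (frequency, relative frequency, percent frequency)}."""
    n = len(data)              # n = total number of observations
    counts = Counter(data)     # frequency of each class
    return {
        label: (freq, freq / n, freq / n * 100)
        for label, freq in counts.items()
    }

table = relative_frequencies(observations)
for label in sorted(table):
    freq, rel, pct = table[label]
    print(f"{label}: frequency={freq}  relative={rel:.2f}  percent={pct:.0f}%")
```

The relative frequencies always sum to 1 and the percent frequencies to 100, which is a quick check on any hand-built table.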

Relative & percent frequency distribution
A tabular summary of data showing the relative frequency (or percent frequency) for each class

Bar Charts
A graphical display for depicting categorical data summarized in a frequency, relative frequency, or percent frequency distribution. The frequency measure is usually shown on the vertical axis and the classes on the horizontal axis

Pie Charts
Graphical display for presenting relative frequency and percent frequency distribution for categorical data. Divides a circle into sectors to represent relative freq or % freq distribution
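
One way to see how a circle is divided into sectors: each class gets a central angle of relative frequency * 360 degrees. The sketch below assumes three made-up classes; the name sector_angles is hypothetical.

```python
# Hypothetical relative frequencies for three classes (they should sum to 1).
relative_frequency = {"A": 0.5, "B": 0.3, "C": 0.2}

def sector_angles(rel_freq):
    """Map each class to the central angle (in degrees) of its pie sector."""
    return {label: rel * 360 for label, rel in rel_freq.items()}

angles = sector_angles(relative_frequency)
for label, angle in angles.items():
    print(f"{label}: {angle:.0f} degrees")
```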

3 steps necessary to define the classes for a frequency distribution with quantitative data:
 1. Determine the number of nonoverlapping classes.
 2. Determine the width of each class.
 3. Determine the class limits

Number of classes
General guideline recommends using between 5 and 20 classes. The appropriate number depends on the number of data items

Width of classes
guide & equation
 general guideline is that the width be the same for each class.
 Approx. class width = (largest data value - smallest data value) / number of classes
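
A short worked example of the class width formula, assuming 20 made-up data values and 5 classes. Rounding the result up to a convenient whole number is a common practice, shown here as an assumption rather than a rule from these notes.

```python
import math

def approximate_class_width(data, number_of_classes):
    """(largest data value - smallest data value) / number of classes."""
    return (max(data) - min(data)) / number_of_classes

# Hypothetical sample of 20 quantitative observations.
data = [12, 14, 19, 18, 15, 15, 18, 17, 20, 27,
        22, 23, 22, 21, 33, 28, 14, 18, 16, 13]

width = approximate_class_width(data, 5)   # (33 - 12) / 5
rounded_width = math.ceil(width)           # round up to a convenient width

print(f"approximate width: {width:.1f}, rounded width: {rounded_width}")
```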

Class limits
Lower and upper class limits must be chosen so that each data item belongs to one and only one class.
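
A sketch of how nonoverlapping class limits place every item in exactly one class. It assumes integer data, so the classes 10-14, 15-19, and so on leave no gaps; the sample values and the function name frequency_distribution are illustrative.

```python
def frequency_distribution(data, first_lower_limit, width, number_of_classes):
    """Count integer data items per class; each class is (lower, upper), inclusive."""
    classes = [
        (first_lower_limit + i * width, first_lower_limit + (i + 1) * width - 1)
        for i in range(number_of_classes)
    ]
    counts = {limits: 0 for limits in classes}
    for value in data:
        for lower, upper in classes:
            if lower <= value <= upper:
                counts[(lower, upper)] += 1
                break
        else:
            # No class matched: the chosen limits do not cover the data.
            raise ValueError(f"{value} falls outside every class")
    return counts

# Hypothetical sample of 20 integer observations.
data = [12, 14, 19, 18, 15, 15, 18, 17, 20, 27,
        22, 23, 22, 21, 33, 28, 14, 18, 16, 13]

dist = frequency_distribution(data, first_lower_limit=10, width=5, number_of_classes=5)
for (lower, upper), freq in dist.items():
    print(f"{lower}-{upper}: {freq}")
```

Because the classes do not overlap, the frequencies sum to the number of data items.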

Dot plot
A graphical summary of data with a horizontal axis showing the range for the data. Each data value is represented by a dot placed above the axis

Histogram
A graphical display of quantitative data with the variable of interest on the horizontal axis and the frequency, relative frequency, or percent frequency on the vertical axis. Unlike a bar chart, a histogram has no separation between the rectangles of adjacent classes

Cumulative Distributions
A tabular summary of quantitative data, using the number of classes, class widths, and class limits developed for the frequency distribution, that shows the number of data items with values less than or equal to the upper class limit of each class

Cumulative Relative Frequency Distribution
 Shows the proportion of data items with values less than or equal to the upper limit of each class
 cum. rel. freq = cum. freq / n (total number of items)

Cumulative percent frequency distribution
Shows the percentage of data items with values less than or equal to the upper limit of each class
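
The three cumulative distributions can be sketched from one list of class frequencies. The upper limits and frequencies below are made-up values for illustration.

```python
from itertools import accumulate

# Hypothetical classes: upper class limits and the frequency of each class.
upper_limits = [14, 19, 24, 29, 34]
frequencies = [4, 8, 5, 2, 1]

n = sum(frequencies)                                   # total number of items
cumulative_freq = list(accumulate(frequencies))        # running total of frequencies
cumulative_rel = [c / n for c in cumulative_freq]      # cum. freq / n
cumulative_pct = [100 * r for r in cumulative_rel]     # cum. rel. freq * 100

for limit, cf, cr, cp in zip(upper_limits, cumulative_freq, cumulative_rel, cumulative_pct):
    print(f"<= {limit}: cum. freq={cf}  cum. rel.={cr:.2f}  cum. %={cp:.0f}")
```

The last class always has a cumulative frequency of n, a cumulative relative frequency of 1, and a cumulative percent frequency of 100.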

Stem and leaf display
A graphical display used to show simultaneously the rank order and shape of data.

2 primary advantages of a stem and leaf display
 1. Easier to construct by hand
 2. Within a class interval, provides more information than the histogram because the actual data values are shown
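
A minimal sketch of building a stem and leaf display, assuming non-negative integer data with the tens as the stem and the ones digit as the leaf. The scores and the function name stem_and_leaf are illustrative.

```python
from collections import defaultdict

def stem_and_leaf(data):
    """Return display lines; assumes non-negative integers, ones digit as the leaf."""
    leaves = defaultdict(list)
    for value in sorted(data):          # sorting gives the rank order within each stem
        leaves[value // 10].append(value % 10)
    lines = []
    # Include every stem in the range so gaps in the data stay visible.
    for stem in range(min(leaves), max(leaves) + 1):
        leaf_text = " ".join(str(leaf) for leaf in leaves[stem])
        lines.append(f"{stem} | {leaf_text}".rstrip())
    return lines

# Hypothetical test scores.
scores = [68, 72, 75, 81, 69, 73, 92, 85, 76, 81]
lines = stem_and_leaf(scores)
print("\n".join(lines))
```

Each row shows the actual values in that interval, while the row lengths give the same shape a histogram would.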

Crosstabulation
tabular summary of data for two variables
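
A crosstabulation can be sketched by counting how often each pair of values occurs. The two variables (a quality rating and a price range) and the records below are made up for illustration.

```python
from collections import Counter

# Hypothetical (quality rating, price range) observations.
records = [
    ("Good", "Low"), ("Good", "High"), ("Excellent", "High"),
    ("Good", "Low"), ("Excellent", "Low"), ("Excellent", "High"),
]

def crosstab(pairs):
    """Return row labels, column labels, and a table of joint frequencies."""
    pairs = list(pairs)
    counts = Counter(pairs)                          # frequency of each (row, col) pair
    row_labels = sorted({row for row, _ in pairs})
    col_labels = sorted({col for _, col in pairs})
    table = [[counts[(r, c)] for c in col_labels] for r in row_labels]
    return row_labels, col_labels, table

rows, cols, table = crosstab(records)
print("".ljust(10) + "".join(c.ljust(6) for c in cols))
for label, row_counts in zip(rows, table):
    print(label.ljust(10) + "".join(str(v).ljust(6) for v in row_counts))
```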

Simpson's paradox
 The reversal of conclusions based on aggregate and unaggregated data.
 conclusions drawn from two or more separate crosstabulations can be reversed when the data are aggregated into a single crosstabulation.
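
The reversal can be checked numerically. The counts below are illustrative, chosen to exhibit the paradox: option A has the higher success rate within each group, yet option B has the higher rate once the groups are aggregated. The names "A", "B", "group 1", and "group 2" are placeholders.

```python
# Illustrative (successes, trials) counts for two options in two groups.
counts = {
    "A": {"group 1": (81, 87), "group 2": (192, 263)},
    "B": {"group 1": (234, 270), "group 2": (55, 80)},
}

def rate(successes, trials):
    """Proportion of successes."""
    return successes / trials

def aggregate(groups):
    """Combine the groups into a single (successes, trials) pair."""
    successes = sum(s for s, _ in groups.values())
    trials = sum(t for _, t in groups.values())
    return successes, trials

# Unaggregated comparison: one crosstabulation per group.
for group in ("group 1", "group 2"):
    a = rate(*counts["A"][group])
    b = rate(*counts["B"][group])
    print(f"{group}: A={a:.3f}  B={b:.3f}")

# Aggregated comparison: a single crosstabulation.
a_all = rate(*aggregate(counts["A"]))
b_all = rate(*aggregate(counts["B"]))
print(f"aggregated: A={a_all:.3f}  B={b_all:.3f}")
```

The reversal happens because the two options are spread unevenly across the groups: A's trials fall mostly in the group with the lower success rates, which pulls its aggregate rate down.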