# AP Stats: Chapter 1

The flashcards below were created by user Gymnastxoxo17 on FreezingBlue Flashcards.

1. statistics
• extracting information from numerical data obtained from an experiment or from a sample
• creating the experiment or sampling procedure, collecting and analyzing data, and making inferences (statements) about the population
2. descriptive statistics
methods for organizing, displaying, and describing data by using tables, graphs, and summary measures
3. inferential statistics
methods that use sample results to help make inferences (decisions or predictions) about a population
4. data analysis
process of describing data using graphs and numerical summaries
5. individuals
objects described by a set of data; may be people, animals, or things
6. variables
any characteristic of an individual
7. categorical variable
places an individual into one of several groups or categories; can be numerical in some cases (zip codes, classes of age)
8. quantitative variable
takes numerical values for which it makes sense to find an average; always specify the units
9. distribution
tells what values a variable takes and how often it takes these values
10. inference
drawing conclusions that go beyond the data at hand
11. frequency table
displays the count (frequency) of observations in each category or class
12. relative frequency table
shows the percents (relative frequencies) of observations in each category or class
13. roundoff error
the difference between the calculated approximation of a number and its exact mathematical value
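The three cards above (frequency table, relative frequency table, roundoff error) can be sketched together in a few lines. This is only an illustration: the color data are made up, and rounding to one decimal place is an assumed choice.

```python
from collections import Counter

# Hypothetical data: favorite color reported by 3 individuals.
observations = ["red", "blue", "green"]

# Frequency table: category -> count.
counts = Counter(observations)
total = sum(counts.values())

# Relative frequency table: percent of observations in each category,
# rounded to one decimal place as it might appear in a printed table.
percents = {category: round(100 * count / total, 1)
            for category, count in counts.items()}

# The rounded percents need not add to exactly 100; the gap is roundoff error.
roundoff_gap = 100 - sum(percents.values())

print(dict(counts))
print(percents)
print(f"rounded percents fall short of 100 by about {roundoff_gap:.1f}")
```

Each category is really 33.33...%, so the three rounded values add to about 99.9% rather than 100%.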
14. pie chart
• shows the distribution of a categorical variable as a "pie" whose slices are sized by the counts or percents for the categories
• must include all of the categories that make up the whole
15. when can you not use pie charts
• if you don't have all the categories that make up the whole
• if you're dealing with individuals that represent a category (e.g. 10-12yrs) since those are different groups, not part of a whole
16. bar graph
used to display the distribution of a categorical variable or to compare the sizes of different quantities. The categories or quantities being compared are on the horizontal axis. Has blank spaces between the bars.
17. how can graphs be misleading
• bars with different widths
• uneven or truncated x-axis and y-axis intervals
18. two-way table
table of counts that organizes data about two categorical variables
19. marginal distribution
• distribution of values of one of the categorical variables in a two-way table among all of the individuals described in the table
• calculated in a two-way table as percentages from the totals for one variable
• says nothing about the relationship between the two variables
20. conditional distribution
• describes the values of one variable among individuals who have a specific value of another variable
• calculated in a two-way table as percentages within a single row or column
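A minimal sketch of how marginal and conditional distributions might be computed from a two-way table of counts. The table (gender by preferred pet) and all of its numbers are hypothetical, chosen only to keep the arithmetic easy to follow.

```python
# Hypothetical two-way table of counts: rows are one categorical variable
# (gender), columns are the other (preferred pet).
table = {
    "female": {"cat": 30, "dog": 20},
    "male":   {"cat": 10, "dog": 40},
}
pets = ["cat", "dog"]

grand_total = sum(sum(row.values()) for row in table.values())

# Marginal distribution of pet preference: each column total as a percent
# of everyone in the table.
marginal_pet = {
    pet: 100 * sum(row[pet] for row in table.values()) / grand_total
    for pet in pets
}

# Conditional distribution of pet preference given gender: each row's counts
# as a percent of that row's total.
conditional_pet = {
    gender: {pet: 100 * count / sum(row.values()) for pet, count in row.items()}
    for gender, row in table.items()
}

print(marginal_pet)
print(conditional_pet)
```

In this made-up table the marginal distribution alone (40% cat, 60% dog) hides the fact that the two conditional distributions differ, which is what suggests an association between the variables.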
21. segmented bar graph
• compares the distribution of a categorical variable in each of several groups. There is a bar for each group with segments that correspond to the different values of the categorical variable.
• height of each segment is determined by the percent of individuals in the group with that value, each bar has a total height of 100%
22. four steps to answer a statistics problem
• STATE the question you want to answer
• PLAN how you will answer the question and which statistical techniques the problem requires
• DO the work: make graphs and carry out the calculations
• CONCLUDE with a practical answer in the setting of the real-world problem
23. side by side bar graph
• used to compare the distribution of a categorical variable in each of several groups. For each value of the categorical variable, there is a bar corresponding to each group.
• height of each bar is determined by the count or percent of individuals in the group with that value
24. association
occurs between two variables if specific values of one variable tend to occur in common with specific values of the other
25. qualitative data
values of categorical data
26. dotplot
a simple graph that shows each data value as a dot above its location on a number line
27. overall pattern
• in any graph of data, this can be described by the direction, form, and strength of the relationship
• SOCS: shape, outliers, center, and spread
28. center
the midpoint (median) represents the typical value, and the mean is the arithmetic average
29. spread
indicates the variability of the data; described by the maximum and minimum values and the range
30. range
maximum-minimum values
31. outlier
an observation that lies outside the overall pattern of other observations
32. residuals
an outlier in the y direction but not the x direction shows up as a large residual (a large vertical distance between the observation and the overall pattern)
33. shape
• number of peaks (modes)
• skewed results or symmetry
• number of clusters + gaps
34. mode
the value or class in a statistical distribution having the greatest frequency
35. unimodal
describes a graph of quantitative data with a single peak
36. bimodal
describes a graph of quantitative data with two clear peaks
37. multimodal
describes a graph of quantitative data with more than two clear peaks
38. symmetry
left and right sides of the graph are approximately mirror images of each other
39. skewed to the right
right side of the graph is much longer than the left side, tail is extended to the right
40. skewed to the left
left side of the graph is much longer than the right side, tail is on the left
41. stemplot
observations are separated into stems (all but the final digit) and leaves (the final digit); the stems are arranged in a vertical column, and the leaves are written next to their stem in increasing order out from the stem
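A rough sketch of how a stemplot could be built in code, assuming non-negative whole numbers so that the leaf is simply the final digit. The function name `stemplot` and the test scores are made up for illustration.

```python
from collections import defaultdict

def stemplot(values):
    """Return the lines of a stemplot for non-negative whole numbers."""
    leaves_by_stem = defaultdict(list)
    for value in sorted(values):
        # stem: all but the final digit; leaf: the final digit
        stem, leaf = divmod(value, 10)
        leaves_by_stem[stem].append(leaf)

    lines = []
    # Include every stem between the smallest and largest, even empty ones,
    # so gaps in the distribution stay visible.
    for stem in range(min(leaves_by_stem), max(leaves_by_stem) + 1):
        leaves = "".join(str(leaf) for leaf in leaves_by_stem.get(stem, []))
        lines.append(f"{stem:>2} | {leaves}")
    return lines

scores = [72, 85, 91, 68, 85, 77, 99, 70]  # hypothetical test scores
for line in stemplot(scores):
    print(line)
```

Because the values are sorted before they are split, the leaves on each stem come out in increasing order automatically.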
42. splitting stems
• a method for spreading out a stemplot that has too few stems
• should use asterisks (e.g. 5* and 5**)
43. back-to-back stemplot
used to compare the distribution of a quantitative variable for two groups; the leaves for one group go on one side of a shared stem and the leaves for the other group go on the other side
44. truncate
removing one or more digits from a value that has too many digits, as when creating stemplots
45. histogram
type of bar graph without spaces between the bars that displays the frequency or relative frequency of each class of a quantitative variable; the horizontal axis shows the classes of the variable and the vertical axis shows the scale of counts or percents; does not preserve the raw data because the data have been grouped into classes
46. time plots
used to show bivariate (two-variable) quantitative data where the independent variable (x) represents time
47. independent/dependent variable on graph axes
• dependent=y-axis
• independent=x-axis
48. mean formula
$\bar{x} = \frac{x_1 + x_2 + \dots + x_n}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i$
49. mean
arithmetic average, non-resistant measure, represents the value each observation would have if the total were split equally among all observations
50. resistant measure
statistic that is not affected very much by extreme observations
51. median
midpoint M of a distribution, half the observations are smaller than this and half are larger, represents typical value, resistant measure
52. median position formula
$\text{position of } M = \frac{n+1}{2}$
• n=# observations in data set
• after arranging data in increasing order, count this many observations in from either end to find the median
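A short sketch of the mean and the median, assuming the usual median position of (n + 1)/2 in the ordered data. The data values are hypothetical, and the second list adds one extreme observation to show why the median is called resistant and the mean is not.

```python
def mean(values):
    """Arithmetic average: add the observations and divide by how many there are."""
    return sum(values) / len(values)

def median(values):
    """Midpoint of the ordered data, located with the position (n + 1) / 2."""
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) / 2               # counted from 1, not from 0
    if position.is_integer():            # odd n: a single middle observation
        return ordered[int(position) - 1]
    lower = ordered[int(position) - 1]   # even n: average the two middle values
    upper = ordered[int(position)]
    return (lower + upper) / 2

data = [2, 4, 5, 7, 9]                   # hypothetical observations
with_outlier = data + [100]              # same data plus one extreme value

print(mean(data), median(data))
print(mean(with_outlier), median(with_outlier))
```

Adding the extreme value moves the mean from 5.4 to above 21, while the median only moves from 5 to 6.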
53. mean > median
right skewed
54. mean = median
symmetric
55. mean < median
left skewed
56. mode
value that occurs the most
57. 68-95-99.7 Rule aka Empirical Rule
in a bell-shaped (approximately normal) distribution, about 68% of the data lie within one standard deviation of the mean, about 95% lie within two standard deviations of the mean, and about 99.7% lie within three standard deviations of the mean
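The three percentages can be checked against an exactly normal model. This sketch assumes a standard normal distribution (mean 0, standard deviation 1) from Python's `statistics.NormalDist`; real bell-shaped data will only match approximately.

```python
from statistics import NormalDist

# Assumption: an exactly normal model with mean 0 and standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {100 * share_within(k):.1f}%")
```

The rule's 68, 95, and 99.7 are rounded versions of these proportions.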
58. interquartile range (IQR)
• measures the range of the middle 50% of the data, resistant measure
• IQR= Q3-Q1
59. first quartile
median of observations to the left of the median
60. third quartile
median of observations to the right of the median
61. percentile implication
95th percentile means that 95% of the population got that score or lower
62. IQR rule for calculating outliers
an observation is an outlier if it falls more than 1.5 x IQR above the third quartile or below the first quartile
63. how to use IQR to calculate bottom cutoff value
Q1-1.5 x IQR
64. how to use IQR to calculate top cutoff value
Q3+1.5 x IQR
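A minimal sketch of the 1.5 x IQR rule, assuming the quartiles are found as on the cards above: Q1 is the median of the observations left of the median and Q3 is the median of those to the right, with the median itself left out when n is odd. Other software may compute quartiles slightly differently. The data are hypothetical.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumes at least two observations.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    lower_half = ordered[:half]     # observations left of the median
    upper_half = ordered[-half:]    # observations right of the median
    return median(lower_half), median(upper_half)

def outlier_cutoffs(values):
    """Return the bottom and top cutoffs: Q1 - 1.5 x IQR and Q3 + 1.5 x IQR."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

data = [1, 3, 4, 5, 6, 7, 8, 9, 30]   # hypothetical observations
low, high = outlier_cutoffs(data)
outliers = [x for x in data if x < low or x > high]

print(quartiles(data), (low, high), outliers)
```

Here Q1 = 3.5 and Q3 = 8.5, so the IQR is 5 and the cutoffs are -4 and 16; only 30 falls outside them.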
65. standard deviation
• measure of spread that looks at how far observations typically are from the mean; typical values fall within about one standard deviation above or below the mean; non-resistant measure
• standard deviation of 0 indicates no variability, greater when observations are more spread out
66. degrees of freedom
n-1, where n is the number of observations
67. variance
$s_x^2$: the average squared distance of the observations in a data set from their mean
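68. standard deviation formula
$s_x = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2}$
69. variance formula
$s_x^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$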
70. how to calculate variance and standard deviation
• find the mean of the data, find the deviation of each observation from the mean, square the deviations and add them up, then divide by the degrees of freedom (n-1) to find the variance
• to find standard deviation, take the square root of variance
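The two steps above translate almost directly into code. This is a sketch with made-up data; the function names are illustrative only.

```python
import math

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    return sum(squared_deviations) / (n - 1)   # n - 1 = degrees of freedom

def sample_standard_deviation(values):
    """Square root of the variance."""
    return math.sqrt(sample_variance(values))

data = [1, 2, 3, 4, 5]   # hypothetical observations; mean is 3

print(sample_variance(data))            # (4 + 1 + 0 + 1 + 4) / 4 = 2.5
print(sample_standard_deviation(data))  # square root of 2.5
```

A data set whose values are all equal would give a variance and standard deviation of 0, matching the "no variability" case on the standard deviation card.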
71. five-number summary
• minimum, first quartile, median, third quartile, maximum
• gives a summary of both center and spread, roughly divides the distribution into quarters
72. boxplot
graphs the five-number summary, box spans the quartiles and whiskers extend to the min/max values, center line represents median
73. modified boxplots
boxplots that always show the outliers as dots
74. side-by-side boxplots
show the boxplots next to each other using the same scale, used to compare distributions of two data sets
75. detecting skewedness in boxplots
the distribution is skewed in the direction of the longer whisker; a larger difference in whisker lengths means a more strongly skewed distribution
76. detecting range and IQR in boxplots
range is represented by full length of boxplot, IQR is represented by length of box
77. options for measuring center and spread, resistant or non-resistant
• median and IQR are resistant; use them when the data are skewed or have outliers
• mean and standard deviation are non-resistant and sensitive to skewness and outliers
78. sigma
Σ represents a summation, "add them up"
79. index
variable i
80. lower limit and upper limit
the numbers below and above a sigma; they give the range of values you are plugging into i and adding up
81. summand
in sigma notation, what you're adding up (e.g. $i^2$)
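Sigma notation maps neatly onto a loop: the index i runs from the lower limit to the upper limit, and the summand is evaluated and added up at each step. The helper name `sigma` and the limits 1 to 4 are illustrative choices, not part of the card set.

```python
def sigma(lower, upper, summand):
    """Add up summand(i) for every whole number i from lower to upper, inclusive."""
    return sum(summand(i) for i in range(lower, upper + 1))

# Sum of i squared for i = 1 to 4: 1 + 4 + 9 + 16
total = sigma(1, 4, lambda i: i ** 2)
print(total)  # 30
```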
82-90. cards with no text answer (the answers appear to have been images and are not reproduced here): solution, bar graph, two-way table, marginal distribution, conditional distribution, segmented bar graph, dotplot, back-to-back stemplot, stemplot
91. frequency table categories
class and count
92. relative frequency table categories
class and percent
93-96. cards with no text answer (the answers appear to have been images and are not reproduced here): frequency histogram and relative frequency histogram, boxplot, side-by-side boxplot, side-by-side bar graph
Author: Gymnastxoxo17; ID: 240663; Card Set: AP Stats: Chapter 1; Updated: 2013-10-16 01:58:21; Tags: Exploring Data