# Math 219: Sec 3.4 - 5.1

### Card Set Information

• Author: lazvertiigo
• ID: 284089
• Filename: Math 219: Sec 3.4 - 5.1
• Updated: 2014-10-02 16:51:09
• Tags: scc statistics stewartIII
• Description: SCC Stats And Probability using Stewart III textbook

The flashcards below were created by user lazvertiigo on FreezingBlue Flashcards.

1. What represents the distance of a data value from the mean in terms of standard deviations?
z-score
2. How do you calculate the population z-score?
z=(x-μ)/σ
3. How do you calculate the sample z-score?
z=(x-x̄)/s
4. What is unitless, has mean = 0, and standard deviation =1 ?
z-score
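As a minimal sketch of the population formula z=(x-μ)/σ, the snippet below standardizes a small made-up data set and checks that the resulting z-scores have mean 0 and standard deviation 1. The function name `population_z_scores` and the data values are illustrative assumptions, not part of the card set.

```python
from statistics import mean, pstdev

def population_z_scores(values):
    """Return the z-score of each value, treating `values` as a whole population."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    return [(x - mu) / sigma for x in values]

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data with mean 5 and sigma 2
z = population_z_scores(data)

print(z)                    # z-scores carry no units
print(mean(z), pstdev(z))   # mean 0 and standard deviation 1
```

For a sample, the same idea would use the sample mean and `statistics.stdev` (the sample standard deviation) instead.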
5. The median is a special case of a general concept called the _______.
percentile
6. In a set of data, what is a value such that k percent of the observations are less than or equal to the value?
kth percentile
7. Define Quartiles.
Quartiles divide the data set into quarters, or fourths. The quartiles are the 25th, 50th, and 75th percentiles, where Q1 = 25th percentile, Q2 = 50th percentile (i.e., the median), and Q3 = 75th percentile.
8. Identify the steps to finding quartiles.
Step 1. Arrange the data in increasing order.

Step 2. Determine the median, M, or second quartile, Q2.

Step 3. Divide the data set into halves: the observations below (to the left of) M and the observations above M. The first quartile, Q1, is the median of the bottom half and the third quartile, Q3, is the median of the top half.
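The three steps can be sketched in a few lines of Python. This is one possible reading of the card: when the number of observations is odd, the median itself is left out of both halves. The function name `quartiles` and the data are assumptions for illustration, and statistical software often uses slightly different quartile conventions.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(values)            # Step 1: arrange in increasing order
    n = len(ordered)
    q2 = median(ordered)                # Step 2: the median, M
    half = n // 2
    lower = ordered[:half]              # Step 3: observations below M
    upper = ordered[half + 1:] if n % 2 else ordered[half:]  # observations above M
    return median(lower), q2, median(upper)

q1, q2, q3 = quartiles([7, 1, 3, 9, 5, 11, 13])
print(q1, q2, q3)
```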
9. Quartiles, on the other hand, are ________ to extreme values.
resistant
10. What is the range of the middle 50% of the observations in a data set? In other words, it is the difference between the third and first quartile.
IQR, the interquartile range
11. What is the formula for IQR?
IQR = Q3 - Q1
12. What are extreme observations in the data?
Outliers
13. Why should outliers be investigated?
Outliers should be investigated because they could be chance occurrences, measurement errors, data entry errors, or sampling errors. Outliers are not necessarily invalid data, but they should be recognized.
14. Describe the steps in Checking for Outliers by Using Quartiles.
Step 1. Determine the first and third quartiles of the data.

Step 2. Compute the interquartile range. The interquartile range or IQR is the difference between Q3 and Q1.

Step 3. Determine the fences: The Lower Fence and The Upper Fence.

Step 4. Values less than the lower fence or more than the upper fence could be considered outliers.
15. What is the formula for the Lower Fence?
LF=Q1-1.5(IQR)
16. What is the formula for the Upper Fence?
UF=Q3+1.5(IQR)
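A short sketch of the outlier check: compute Q1 and Q3, take IQR = Q3 - Q1, place the lower fence 1.5(IQR) below Q1 and the upper fence 1.5(IQR) above Q3, then flag anything outside the fences. The helper name `quartile_fences`, the median-of-halves quartile convention, and the data are assumptions made for illustration.

```python
from statistics import median

def quartile_fences(values):
    """Return (lower_fence, upper_fence) based on Q1, Q3, and the IQR."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    q1 = median(ordered[:half])
    q3 = median(ordered[half + 1:] if n % 2 else ordered[half:])
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

data = [1, 3, 5, 7, 9, 11, 40]  # made-up data with one extreme value
lower_fence, upper_fence = quartile_fences(data)
outliers = [x for x in data if x < lower_fence or x > upper_fence]
print(lower_fence, upper_fence, outliers)
```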
17. What is the collection of the smallest value, the first quartile (Q1 or P25), the median (Q2 or P50), the third quartile (Q3 or P75), and the largest value?
The Five-Number Summary
18. What kind of graph can illustrate the five-number summary?
A Boxplot.
19. Describe the distribution:
Skewed Right
20. Describe the distribution:
Symmetric
21. Describe the distribution:
Skewed Left
22. Data for a single variable is called _____.
univariate data
23. Data that records two variables for each individual, so that the relation between them can be studied, is called _______.
bivariate data
24. What is the variable whose value can be explained by the value of the explanatory or predictor variable?
response variable
25. What is a graph that shows the relationship between two quantitative variables?

Each individual is represented by a point in the diagram: the explanatory variable, x, is plotted on the horizontal scale and the response variable, y, is plotted on the vertical scale.
A Scatter Diagram
26. Explain why it is necessary to show a scatter diagram with the correlation coefficient when claiming that a linear relation exists between two variables.
Influential observations can cause the correlation coefficient to increase substantially, thereby increasing the apparent strength of the linear relation between two variables.
27. positive linear association
Above-average values of one variable are associated with above-average values of the other, and below-average values of one variable are associated with below-average values of the other. In other words, two variables are positively associated if, whenever the value of one variable increases, the value of the other variable also increases.
28. negative linear association
Above-average values of one variable are associated with below-average values of the other, and below-average values of one variable are associated with above-average values of the other. In other words, two variables are negatively associated if, whenever the value of one variable increases, the value of the other variable decreases.
29. What is a measure of strength of linear relation between two quantitative variables?
linear correlation coefficient
30. r is always between ___ and ___ inclusive.
r is always between -1 and +1 inclusive. That is −1≤r≤1.
31. If r=+1, then a ____ _____ ____ relation exists between the two variables.
If r=+1, then a perfect positive linear relation exists between the two variables.
32. If r=−1, then a ____ _____ _____ relation exists between the two variables.
If r=−1, then a perfect negative linear relation exists between the two variables.
33. Positive values of r correspond to evidence of ____ ______ between the two variables.
Positive values of r correspond to evidence of positive association between the two variables.
34. Negative values of r correspond to evidence of _____ _______ between the two variables
Negative values of r correspond to evidence of negative association between the two variables
35. If r is close to 0, then _____ ___ _____ exists of a linear relation between the two variables.
If r is close to 0, then little or no evidence exists of a linear relation between the two variables.
36. T/F:  r close to 0 does not imply no relation, just no linear relation.
True
37. r is a _______ measure.
r is a unitless measure.
38. r is not _________. Therefore, an observation that does not follow the overall pattern of the data could affect the value of the linear correlation coefficient
r is not resistant. Therefore, an observation that does not follow the overall pattern of the data could affect the value of the linear correlation coefficient
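The properties of r listed on the cards above can be checked numerically. The sketch below computes r from its standard definition (the card set does not show the formula, so this definition is an assumption here) and uses made-up data to illustrate four points: a perfect line gives r = 1, rescaling a variable leaves r unchanged because r is unitless, a curved relation can give r = 0, and a single unusual point can change r sharply because r is not resistant.

```python
from math import sqrt

def correlation(xs, ys):
    """Linear correlation coefficient r of two equal-length lists."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]

r_perfect = correlation(xs, ys)                        # points lie exactly on a line
r_rescaled = correlation([100 * x for x in xs], ys)    # change of units for x
r_curved = correlation([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])  # y = x^2, not linear
r_with_outlier = correlation(xs + [6], ys + [0])       # one point off the pattern

print(r_perfect, r_rescaled, r_curved, r_with_outlier)
```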
39. Define r for the following graph:
Perfect positive linear relation, r=1
40. Define r for the following graph:
Strong positive linear relation, r ≃ 0.9
41. Define r for the following graph:
Moderate positive linear relation, r ≃ 0.4
42. Define r for the following graph:
Perfect negative linear relation, r = -1
43. Define r for the following graph:
Strong negative linear relation, r ≃ -0.9
44. Define r for the following graph:
Moderate negative linear relation, r ≃ -0.4
45. Define r for the following graph:
No linear relation, r close to 0
46. Define r for the following graph:
No linear relation, r close to 0
47. T/F: Correlation does not imply causation.
T: Just because two variables are correlated does not mean that one causes the other to change. One way that two variables can be related even though there is not a causal relation is through a lurking variable.
48. What is the line that best describes the linear relationship?
regression line
49. What is the difference between the observed values and the predicted values?
Residuals or errors.
50. least-squares regression line
the line that minimizes the sum of the squared errors
51. Identify the equation:
Least-squares regression line
52. For ŷ = b1x + b0, b1 represents...
the slope of the least-squares regression line
53. What is the difference between the observed values and the predicted values?
Residuals
54. What is the formula for Residuals?
• Residuals = observed y - predicted y
• or Residuals = y - y^
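As an illustration of residuals, the sketch below fits a least-squares line to made-up data and computes observed y minus predicted y for each point. The slope and intercept formulas used (b1 = Sxy/Sxx and b0 = ȳ - b1·x̄) are the standard ones and are an assumption here, since the card set does not show them; the function name is hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (b0, b1), the intercept and slope of the least-squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

xs = [1, 2, 3, 4, 5]  # made-up explanatory values
ys = [2, 4, 5, 4, 5]  # made-up response values

b0, b1 = least_squares_line(xs, ys)
predicted = [b0 + b1 * x for x in xs]
residuals = [y - y_hat for y, y_hat in zip(ys, predicted)]  # observed - predicted

print(b0, b1)
print(residuals)
```

With these numbers the residuals add up to zero (apart from rounding), and the fitted line passes through the point of means (3, 4).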
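55. What is true about the point that is always contained in the least-squares regression line?
It is the point (x̄, ȳ): the mean of the explanatory variable paired with the mean of the response variable.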
56. Interpreting slope of a regression line:
For each one-unit increase in the explanatory variable, the response variable increases or decreases, on average, by b1.
57. Interpretation of the y-intercept:
We interpret the y-intercept as being the value of the response variable when the value of the explanatory variable is 0. Sometimes the y-intercept will not make sense and will not be interpretable, but it is still needed in the model.
58. What is the use of a regression line for predictions outside the range of x values used to obtain the line?
Extrapolation
59. Why is it highly advised to only use the regression model to make prediction within the given range of data?
Making predictions using values that are outside those observed in the data can be very dangerous in practice. We cannot be certain of the behavior of the data where we have no observations.
60. Of what is this an example?
Extrapolation: the model is being used to predict values outside of the observed data.
61. What does R² represent?
coefficient of determination
62. What measures the percent of total variation in the response variable that is explained by the least-squares regression line?
The coefficient of determination R², which measures how well y^ describes the relationship between the two variables.
63. R² close to 0 indicates _______.
a model with very little explanatory power
64. R² close to 1 indicates _______.
a model with much explanatory power
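One common way to compute the coefficient of determination is R² = 1 - SSE/SST, where SSE is the sum of squared residuals and SST is the total squared variation of y about its mean. The card set does not state this formula, so it is an assumption here, as are the function name and the data.

```python
def r_squared(xs, ys):
    """Coefficient of determination for the least-squares line of ys on xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))  # unexplained variation
    sst = sum((y - y_bar) ** 2 for y in ys)                      # total variation
    return 1 - sse / sst

r2_noisy = r_squared([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])   # line explains part of the variation
r2_exact = r_squared([1, 2, 3, 4, 5], [3, 5, 7, 9, 11])  # points lie exactly on a line
print(r2_noisy, r2_exact)
```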
65. Why do we analyze residuals?
• 1) To determine if the linear model is appropriate
• 2) To determine whether the variance of the residuals is constant
• 3) To check for outliers
66. T/F: A correlation coefficient that indicates a linear relation between two variables does not, by itself, imply that the relation is linear.
True
67. To determine if a linear model is appropriate we need to also draw a __________ ________.
To determine if a linear model is appropriate we need to also draw a residual plot.
68. What is a scatter diagram with the residuals on the vertical axis and the explanatory variable on the horizontal axis?
Residual Plot
69. T/F: If a plot of the residuals against the explanatory variable shows any discernible pattern, such as a curve, then the explanatory and the response variables may not be linearly related.
True
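To see what a "discernible pattern" can look like, the sketch below fits a least-squares line to made-up data that actually follow y = x² and lists the residuals in order of x. The standard slope and intercept formulas are assumed, and the variable names are illustrative only.

```python
xs = [1, 2, 3, 4, 5]
ys = [x ** 2 for x in xs]  # curved relation: 1, 4, 9, 16, 25

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum((x - x_bar) ** 2 for x in xs)
b0 = y_bar - b1 * x_bar

# Residual = observed y - predicted y, listed in order of increasing x
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
print(residuals)
```

The residuals start positive, dip negative in the middle, and end positive again. On a residual plot this U shape is the kind of curve that suggests a linear model is not appropriate.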
70. T/F: If residuals are scattered randomly around 1, chances are your data fit a linear model.
• False:
• If residuals are scattered randomly around 0, chances are your data fit a linear model.
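71. What is constant error variance or homoscedasticity?
Constant error variance (homoscedasticity) means the spread of the residuals stays about the same across all values of the explanatory variable. If a plot of the residuals against the explanatory variable shows the spread of the residuals increasing or decreasing as the explanatory variable increases, then this strict requirement of the linear model is violated.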
72. Of what is this an example?
Homoscedasticity
73. Of what is this an example:
contingency tables or two-way tables.
74. What is the first step in analyzing contingency tables?
The first step in analyzing contingency tables is to analyze each of the variables separately: the row variable by itself and the column variable by itself. The variables, when analyzed separately, have their marginal distributions.
75. How do we compute the relative frequency marginal distributions?
Divide the row marginal frequencies by the grand total to get the row relative frequency marginal distribution and divide the column marginal frequencies by the grand total to get the column relative frequency marginal distribution
76. What lists the relative frequency of each category of a variable, given a specific value of the other variable in the contingency table?
A conditional distribution.
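The marginal and conditional distributions can be sketched with a small two-way table. The table below is entirely made up for illustration, and the variable names are assumptions.

```python
# Made-up contingency table: rows are groups, columns are responses
table = {
    "Group A": {"Yes": 20, "No": 30},
    "Group B": {"Yes": 10, "No": 40},
}
columns = ["Yes", "No"]
grand_total = sum(sum(row.values()) for row in table.values())

# Relative frequency marginal distributions: marginal totals / grand total
row_marginal = {name: sum(row.values()) / grand_total for name, row in table.items()}
column_marginal = {
    col: sum(row[col] for row in table.values()) / grand_total for col in columns
}

# Conditional distribution of the column variable, given each row category
conditional_by_row = {
    name: {col: row[col] / sum(row.values()) for col in columns}
    for name, row in table.items()
}

print(row_marginal)
print(column_marginal)
print(conditional_by_row)
```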
77. probability
the measure of the likeliness for an event to occur
78. T/F: Probability deals with summarizing data.
• False.
• Probability deals with predicted outcomes.
79. T/F: Probability relates long-term results to short-term results.
• False.
• Probability relates short-term results to long-term results
80. Define: as the number of repetitions of a probability experiment increases, the proportion with which a certain outcome is observed gets closer to the probability of the outcome.
The Law of Large Numbers
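A quick simulation can illustrate the Law of Large Numbers. The sketch below flips a simulated fair coin and records the proportion of heads at a few checkpoints; the function name, the checkpoints, and the fixed seed are assumptions chosen so the run is repeatable.

```python
import random

def running_proportions(num_flips, checkpoints, seed=0):
    """Simulate fair coin flips; return {flip count: proportion of heads so far}."""
    rng = random.Random(seed)  # fixed seed so the simulation is repeatable
    heads = 0
    proportions = {}
    for flip in range(1, num_flips + 1):
        if rng.random() < 0.5:  # treat this as "heads"
            heads += 1
        if flip in checkpoints:
            proportions[flip] = heads / flip
    return proportions

props = running_proportions(100_000, {10, 100, 1_000, 10_000, 100_000})
for n in sorted(props):
    print(f"{n:>7} flips: proportion of heads = {props[n]:.4f}")
```

Early proportions can be far from 0.5, while the proportion after many flips should settle close to the true probability.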
81. What is a repeatable process where the results are uncertain?
experiment
82. What is an outcome?
one specific possible result
83. What is the set of all possible outcomes?
sample space, S
84. What is a collection of possible outcomes?
event, E
85. simple events, e
Events with one outcome.
86. What are the Rules of Probabilities?
1. The probability of any event, P(E), must be greater than or equal to zero and less than or equal to one, i.e., 0≤P(E)≤1.

2. The sum of the probabilities of all outcomes must equal one.
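Both rules can be checked mechanically for a proposed probability model. In this sketch a model is assumed to be a dictionary mapping each outcome to its probability; the function name and the example models are hypothetical.

```python
from math import isclose

def is_valid_probability_model(model):
    """Check Rule 1 (each probability in [0, 1]) and Rule 2 (probabilities sum to 1)."""
    probabilities = list(model.values())
    rule_1 = all(0 <= p <= 1 for p in probabilities)
    rule_2 = isclose(sum(probabilities), 1.0)  # tolerate floating-point rounding
    return rule_1 and rule_2

fair_die = {face: 1 / 6 for face in range(1, 7)}
sums_too_high = {"heads": 0.7, "tails": 0.4}   # violates Rule 2
negative_entry = {"a": -0.2, "b": 1.2}         # violates Rule 1

print(is_valid_probability_model(fair_die))
print(is_valid_probability_model(sums_too_high))
print(is_valid_probability_model(negative_entry))
```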
87. T/F: Probabilities can be written as decimals, percents, and fractions.
True
88. If an event is impossible, then its probability must be equal to ____
zero.
89. If an event is a certainty, then its probability must be equal to ____.
one.
90. An unusual event is one that has ____ probability of occurring.
An unusual event is one that has low probability of occurring, i.e., 5% or less.
91. What is the formula for approximating probabilities using the Empirical Approach?
P(E) ≈ Relative Frequency of E = (Freq of E) / (# of trials of experiment)
92. If we do not know the probability of a certain event E, we can conduct a series of experiments to approximate it.  This is called ______.
Approximating Probabilities Using the Empirical Approach
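The empirical approach amounts to counting. The sketch below approximates the probability of rolling an even number from a list of recorded die rolls; the rolls are made up for illustration and the function name is an assumption.

```python
def empirical_probability(outcomes, event):
    """Relative frequency of `event` (a set of outcomes) among observed trials."""
    frequency = sum(1 for outcome in outcomes if outcome in event)
    return frequency / len(outcomes)

# Made-up record of 20 die rolls
rolls = [3, 6, 2, 2, 5, 1, 4, 6, 2, 3, 5, 4, 1, 6, 2, 4, 3, 6, 5, 2]
p_even = empirical_probability(rolls, {2, 4, 6})
print(p_even)  # (frequency of even rolls) / (number of trials)
```

A different set of 20 rolls would likely give a different estimate, which is why this is an approximation rather than an exact probability.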
93. What is the formula for Computing Probabilities Using the Classical Method?
P(E) = N(E) / N(S)  | S = sample space
94. T/F: The classical method applies to experiments where all possible outcomes are equally likely.
True
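As a sketch of the classical method, the code below lists the sample space for rolling two fair dice (36 equally likely outcomes) and computes P(E) = N(E) / N(S) for the event "the sum is 7". The function name and the choice of event are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

def classical_probability(sample_space, event):
    """P(E) = N(E) / N(S), assuming every outcome in the sample space is equally likely."""
    favorable = [outcome for outcome in sample_space if event(outcome)]
    return Fraction(len(favorable), len(sample_space))

two_dice = list(product(range(1, 7), repeat=2))  # all (die1, die2) pairs
p_sum_seven = classical_probability(two_dice, lambda roll: sum(roll) == 7)
print(len(two_dice), p_sum_seven)
```

Using `Fraction` keeps the answer exact, which matches how classical probabilities are usually written.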
95. What is subjective probability?
A subjective probability is a person’s estimate of the chance of an event occurring. This probability is based on personal judgment.
96. T/F: Subjective probabilities should still be between zero and one, but must obey the laws of probability.
• False.
• Subjective probabilities should still be between zero and one, but may not obey the laws of probability.
97. An economist predicting there is a 20% chance of recession next year would be an example of what?
subjective probability
