Statistics Ch 5

  1. Regression line
    • a straight line that describes how a response variable y changes as an explanatory variable x changes. One variable explains or predicts the other.
    • May be used to predict the value of y for a given value of x.
  2. Least-squares regression line:
    • the unique line such that the sum of the squared vertical (y) distances between the data points and the line is the smallest possible.
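
A minimal sketch of what "smallest possible" means, using a small made-up data set (the numbers and the helper names `sse` and `least_squares` are illustrative assumptions, not part of the card set). It fits the line, then checks that nudging the intercept or slope only makes the sum of squared vertical distances larger.

```python
# Made-up illustrative data (not from the card set).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

def sse(a, b):
    """Sum of squared vertical distances between the points and y = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

def least_squares(xs, ys):
    """Return (intercept, slope) of the least-squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    return y_bar - b * x_bar, b

a, b = least_squares(xs, ys)   # intercept about 2.2, slope about 0.6
best = sse(a, b)

# Every nearby line (intercept and/or slope shifted by 0.1) has a larger sum.
nudged = [
    sse(a + da, b + db)
    for da in (-0.1, 0.0, 0.1)
    for db in (-0.1, 0.0, 0.1)
    if (da, db) != (0.0, 0.0)
]
print(best, min(nudged))
```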
  3. Facts about least-squares regression:
    • 1. The distinction between explanatory and response variables is essential in regression.
    • 2. There is a close connection between correlation and the slope of the least-squares line.
    • 3. The least-squares regression line always passes through the point (x̄, ȳ), the means of x and y.
    • 4. The correlation r describes the strength of a straight-line relationship. The square of the correlation, r², is the fraction of the variation in the values of y that is explained by the least-squares regression of y on x.
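
Facts 1 and 3 can be checked numerically. The sketch below uses made-up data and a hypothetical helper `fit`; it regresses y on x, then x on y, and shows that the two lines differ while the y-on-x line still passes through the point of means.

```python
from statistics import mean

def fit(explanatory, response):
    """Least-squares (intercept, slope) for response regressed on explanatory."""
    e_bar, r_bar = mean(explanatory), mean(response)
    sxy = sum((e - e_bar) * (r - r_bar) for e, r in zip(explanatory, response))
    sxx = sum((e - e_bar) ** 2 for e in explanatory)
    slope = sxy / sxx
    return r_bar - slope * e_bar, slope

# Made-up illustrative data (not from the card set).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit(xs, ys)   # y regressed on x: y-hat = a + b*x
c, d = fit(ys, xs)   # x regressed on y: x-hat = c + d*y

# Fact 3: plugging the mean of x into the y-on-x line returns the mean of y.
passes_through_means = abs((a + b * mean(xs)) - mean(ys)) < 1e-9

# Fact 1: rewriting x = c + d*y as y = (x - c) / d gives slope 1/d,
# which is not the same as the y-on-x slope b.
slope_if_roles_swapped = 1 / d
print(b, slope_if_roles_swapped, passes_through_means)
```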
  4. Equation of least-squares regression line:
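    ŷ = a + bx, where the slope is b = r(s_y / s_x) and the intercept is a = ȳ − b·x̄. Here r is the correlation, s_x and s_y are the standard deviations of x and y, and x̄ and ȳ are their means.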
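
A minimal sketch of computing the line from summary statistics, assuming the usual textbook route: the slope comes from the correlation and the two standard deviations, and the intercept from the two means. The data are made up for illustration.

```python
from statistics import mean, stdev

# Made-up illustrative data (not from the card set).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar, y_bar = mean(xs), mean(ys)
s_x, s_y = stdev(xs), stdev(ys)   # sample standard deviations (n - 1 divisor)

# Correlation: average product of standardized values, with n - 1 divisor.
r = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys)) / (n - 1)

b = r * s_y / s_x        # slope
a = y_bar - b * x_bar    # intercept

print(f"r = {r:.4f}, y-hat = {a:.2f} + {b:.2f} x")
```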
  5. Coefficient of determination, r²
    r²: the fraction of the variation in y that is explained by the least-squares regression of y on x. The remaining fraction, 1 − r², is the vertical scatter of the points about the regression line.
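
One way to see the "fraction explained" reading is to compute it two ways and compare. This sketch, on made-up data, squares the correlation and also computes 1 minus (scatter about the line) / (total scatter about ȳ); the variable names are illustrative assumptions.

```python
from statistics import mean

# Made-up illustrative data (not from the card set).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(xs), mean(ys)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)   # total variation in y

b = sxy / sxx
a = y_bar - b * x_bar

# Variation left over after fitting the line (sum of squared residuals).
unexplained = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

r_squared_from_r = sxy ** 2 / (sxx * syy)     # r squared, from the correlation
r_squared_from_fit = 1 - unexplained / syy    # fraction of variation explained

print(r_squared_from_r, r_squared_from_fit)
```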
  6. Residuals
    residual = observed y − predicted y = y − ŷ (the signed vertical distance from a data point to the regression line)
  7. Residual plots
    • Residuals are the signed vertical differences between y-observed and y-predicted. We plot them against x in a residual plot.
    • If residuals are scattered randomly around 0, chances are your data fit a linear model, the residuals are roughly normally distributed, and you have no outliers.
    • The x-axis in a residual plot is the same as on the scatterplot.
    • The horizontal line at 0 on the residual plot corresponds to the regression line on the scatterplot.
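
A minimal sketch of building the points for a residual plot, on made-up data. It computes each residual as observed minus predicted and pairs it with its x value; drawing the plot itself is left out so the example needs no plotting library.

```python
from statistics import mean

# Made-up illustrative data (not from the card set).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(xs), mean(ys)
b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
a = y_bar - b * x_bar

# residual = observed y - predicted y
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

# Points for the residual plot: same x-axis as the scatterplot, residual on the y-axis.
residual_plot_points = list(zip(xs, residuals))

# Residuals from a least-squares line always sum to (numerically) zero.
print([round(e, 2) for e in residuals], round(sum(residuals), 10))
```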
  8. Outlier:
    An observation that lies outside the overall pattern of observations.
  9. Influential individual
    • An observation that markedly changes the regression if removed.
    • Such a point is often an outlier in the x direction.
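
To illustrate, the sketch below fits a line with and without one made-up point that sits far out in the x direction. The data and the helper name `slope` are assumptions for illustration only.

```python
from statistics import mean

def slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

# Made-up data: the last point (x = 15) is far from the others in x.
xs = [1, 2, 3, 4, 5, 15]
ys = [2, 4, 5, 4, 5, 3]

slope_with = slope(xs, ys)
slope_without = slope(xs[:-1], ys[:-1])

# Removing the single far-out point changes the slope markedly,
# here even flipping its sign.
print(round(slope_with, 3), round(slope_without, 3))
```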
  10. Interpolation
    • Making predictions
    • The equation of the least-squares regression line allows you to predict y for any x within the range studied. This is called interpolating.
  11. Lurking variable
    • A variable not included in the study design that does have an effect on the variables studied.
    • It can falsely suggest a relationship.
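
A small synthetic illustration, not real data: a hidden variable `z` drives both `x` and `y`, and neither `x` nor `y` affects the other. The two still show a strong correlation, which is how a lurking variable can falsely suggest a relationship. All values and names here are made up.

```python
from statistics import mean

def correlation(u, v):
    """Pearson correlation of two equal-length lists."""
    u_bar, v_bar = mean(u), mean(v)
    suv = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    suu = sum((a - u_bar) ** 2 for a in u)
    svv = sum((b - v_bar) ** 2 for b in v)
    return suv / (suu * svv) ** 0.5

# Hypothetical lurking variable, never recorded in the "study".
z = [1, 2, 3, 4, 5, 6]

# x and y each depend on z plus a little fixed, made-up noise; y does not depend on x.
x = [2 * zi + e for zi, e in zip(z, [0.3, -0.2, 0.1, -0.4, 0.2, 0.0])]
y = [3 * zi + e for zi, e in zip(z, [-0.5, 0.4, 0.0, 0.6, -0.3, 0.1])]

r_xy = correlation(x, y)
print(round(r_xy, 3))   # close to 1 even though x has no effect on y
```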
  12. Confounded variables
    • Two variables are confounded when their effects on a response variable cannot be distinguished from each other. The confounded variables may be either explanatory variables or lurking variables.
  13. Extrapolation
    The use of a regression line for predictions outside the range of x values used to obtain the line. Such predictions are often unreliable.
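
A minimal sketch contrasting interpolation with extrapolation, on made-up data. The hypothetical helper `predict` returns the predicted y together with a label saying whether the requested x lies inside the range of x values used to fit the line.

```python
from statistics import mean

# Made-up illustrative data (not from the card set).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(xs), mean(ys)
b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
a = y_bar - b * x_bar

def predict(x):
    """Return (predicted y, kind), where kind flags predictions outside the fitted range."""
    kind = "interpolation" if min(xs) <= x <= max(xs) else "extrapolation"
    return a + b * x, kind

print(predict(3))    # x inside the range 1..5
print(predict(10))   # x outside the range: treat the result with caution
```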
Author: firefly501
ID: 137598
Card Set: Statistics Ch 5
Description: Regression: relationship between two variables