The flashcards below were created by user firefly501 on FreezingBlue Flashcards.

Regression line
 a straight line that describes how a response variable y changes as an explanatory variable x changes. One variable explains or predicts the other.
 May be used to predict the value of y for a given value of x.

Least-squares regression line:
 the unique line such that the sum of the squared vertical (y) distances between the data points and the line is the smallest possible.
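
The definition above can be sketched in a few lines of Python. This is only an illustration: the data are made up, and the helper names least_squares_line and sum_squared_errors are hypothetical, not from the cards. It fits the line with the usual textbook formulas and then checks that nudging the intercept or slope only makes the sum of squared vertical distances larger.

```python
from statistics import mean

def least_squares_line(xs, ys):
    """Return (intercept a, slope b) of the least-squares line y-hat = a + b*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b

def sum_squared_errors(xs, ys, a, b):
    """Sum of squared vertical distances between the points and the line a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Made-up illustrative data (an assumption, not from the cards).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = least_squares_line(xs, ys)
best = sum_squared_errors(xs, ys, a, b)

# Shift the intercept and slope slightly in every direction;
# each nudged line has a larger sum of squares than the fitted one.
nudged = [
    sum_squared_errors(xs, ys, a + da, b + db)
    for da in (-0.1, 0.1)
    for db in (-0.1, 0.1)
]
print(a, b, best, all(s > best for s in nudged))
```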

Facts about least-squares regression:
 1. The distinction between explanatory and response variables is essential in regression.
 2. There is a close connection between correlation and the slope of the least-squares line.
 3. The least-squares regression line always passes through the point (x̄, ȳ).
 4. The correlation r describes the strength of a straight-line relationship. The square of the correlation, r², is the fraction of the variation in the values of y that is explained by the least-squares regression of y on x.
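
Facts 1 to 3 can be checked numerically. The sketch below uses made-up data and hypothetical variable names. It assumes the standard relationship slope = r · (s_y / s_x), which the cards hint at but do not state.

```python
from math import sqrt
from statistics import mean, stdev

# Made-up illustrative data (an assumption, not from the cards).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(xs), mean(ys)
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

r = sxy / sqrt(sxx * syy)            # correlation
slope = sxy / sxx                    # least-squares slope of y on x
intercept = y_bar - slope * x_bar

# Fact 2: the slope equals r times the ratio of standard deviations.
slope_from_r = r * stdev(ys) / stdev(xs)

# Fact 3: the fitted line passes through the point of means.
y_hat_at_x_bar = intercept + slope * x_bar

# Fact 1: swapping the roles of x and y gives a different line.
# The slope of x on y is not simply 1/slope; the two slopes multiply to r squared.
slope_x_on_y = sxy / syy

print(slope, slope_from_r, y_hat_at_x_bar, y_bar, slope * slope_x_on_y, r ** 2)
```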

Equation of least-squares regression line:
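
For reference, one common textbook form of the equation is given below. The symbols are assumed notation, not taken from the cards: a is the intercept, b is the slope, s_x and s_y are the sample standard deviations of x and y, and r is the correlation. The form agrees with the facts listed earlier, since the slope depends on r and the line passes through (x̄, ȳ).

```latex
\hat{y} = a + b x, \qquad b = r \, \frac{s_y}{s_x}, \qquad a = \bar{y} - b \, \bar{x}
```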

Coefficient of determination, r²
r²: the fraction of the variance in y that can be explained by changes in x. The vertical scatter about the regression line is the part that remains unexplained.
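
A small sketch of this idea, using made-up data and hypothetical variable names: the fraction of variation explained, computed as 1 - SSE/SST, should match the square of the correlation.

```python
from math import sqrt
from statistics import mean

# Made-up illustrative data (an assumption, not from the cards).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(xs), mean(ys)
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

slope = sxy / sxx
intercept = y_bar - slope * x_bar

# SST: total variation of y around its mean.
sst = syy
# SSE: variation left over around the regression line (unexplained).
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

explained_fraction = 1 - sse / sst
r = sxy / sqrt(sxx * syy)

print(explained_fraction, r ** 2)
```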

Residuals
residual = y - ŷ, the vertical distance between the observed y and the predicted ŷ
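
As a sketch, residuals can be computed point by point once a line has been fitted. The data and names below are made up for illustration. For a least-squares line the residuals sum to zero, up to rounding.

```python
from statistics import mean

# Made-up illustrative data (an assumption, not from the cards).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(xs), mean(ys)
slope = (
    sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    / sum((x - x_bar) ** 2 for x in xs)
)
intercept = y_bar - slope * x_bar

# residual = observed y minus predicted y-hat, one per data point
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

print(residuals)
print(sum(residuals))  # very close to 0 for a least-squares fit
```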

Residual plots
 Residuals are the vertical distances between the observed y and the predicted y (ŷ). We plot them against x in a residual plot.
 If the residuals are scattered randomly around 0, chances are that a linear model fits your data, the residuals are roughly normally distributed, and there are no outliers.
 The x-axis in a residual plot is the same as on the scatterplot.
 The horizontal line at 0 on the residual plot corresponds to the regression line on the scatterplot.
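
To see what a non-random residual pattern looks like, the sketch below fits a straight line to made-up curved data (y = x², chosen only for illustration) and prints the sign of each residual in x order. A run of same-signed residuals, instead of a random mix, suggests that a straight line is the wrong model.

```python
from statistics import mean

# Made-up curved data: y = x squared (an assumption for illustration).
xs = [0, 1, 2, 3, 4]
ys = [x ** 2 for x in xs]

x_bar, y_bar = mean(xs), mean(ys)
slope = (
    sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    / sum((x - x_bar) ** 2 for x in xs)
)
intercept = y_bar - slope * x_bar

residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Positive at both ends and negative in the middle: a U-shaped pattern.
signs = ["+" if res > 0 else "-" for res in residuals]
print(residuals)
print(signs)
```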

Outlier:
An observation that lies outside the overall pattern of observations.

Influential individual
 An observation that markedly changes the regression if removed.
 This is often an outlier on the x-axis.
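
A quick way to check for influence is to refit the line without the suspect point and compare. In this sketch the data are made up and the helper name slope_of is hypothetical. The last point sits far out on the x-axis and pulls the slope from 1 down to about 0.04.

```python
from statistics import mean

def slope_of(xs, ys):
    """Least-squares slope of y on x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

# Made-up data: four points on the line y = x, plus one point far out in x.
xs = [1, 2, 3, 4, 10]
ys = [1, 2, 3, 4, 2]

slope_all = slope_of(xs, ys)               # with the extreme point
slope_without = slope_of(xs[:-1], ys[:-1])  # extreme point removed

print(slope_all, slope_without)
```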

Interpolation
 Making predictions
 The equation of the least-squares regression allows you to predict y for any x within the range studied. This is called interpolating.

Lurking variable
 is a variable not included in the study design that does have an effect on the variables studied.
 It can falsely suggest a relationship.
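
The sketch below shows how this can happen, using made-up numbers and hypothetical names. A lurking variable z drives both x and y, so x and y are strongly correlated even though the parts of x and y that do not come from z (e and f) are unrelated.

```python
from math import sqrt
from statistics import mean

def correlation(us, vs):
    """Pearson correlation between two equal-length lists."""
    u_bar, v_bar = mean(us), mean(vs)
    suv = sum((u - u_bar) * (v - v_bar) for u, v in zip(us, vs))
    suu = sum((u - u_bar) ** 2 for u in us)
    svv = sum((v - v_bar) ** 2 for v in vs)
    return suv / sqrt(suu * svv)

# Made-up data (an assumption for illustration).
z = [1, 2, 3, 4, 5, 6]                      # lurking variable
e = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5]       # part of x not due to z
f = [0.5, 0.5, -0.5, -0.5, 0.5, 0.5]        # part of y not due to z

x = [zi + ei for zi, ei in zip(z, e)]       # x depends on z
y = [2 * zi + fi for zi, fi in zip(z, f)]   # y depends on z, not on x

r_xy = correlation(x, y)         # strong, but produced entirely by z
r_without_z = correlation(e, f)  # what is left once z is accounted for

print(r_xy, r_without_z)
```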

Confounded variables
 Two variables are confounded when their effects on a response variable cannot be distinguished from each other. The confounded variables may be either explanatory variables or lurking variables.

Extrapolation
is the use of a regression line for predictions outside the range of x values used to obtain the line.
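
The difference between interpolation and extrapolation comes down to whether x lies inside the range of the data used for the fit. A minimal sketch, with made-up data and a hypothetical helper named predict:

```python
from statistics import mean

def predict(x, intercept, slope, x_min, x_max):
    """Return (y_hat, kind); kind says whether x is inside the fitted range."""
    kind = "interpolation" if x_min <= x <= x_max else "extrapolation"
    return intercept + slope * x, kind

# Made-up illustrative data (an assumption, not from the cards).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(xs), mean(ys)
slope = (
    sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    / sum((x - x_bar) ** 2 for x in xs)
)
intercept = y_bar - slope * x_bar

inside = predict(3.5, intercept, slope, min(xs), max(xs))
outside = predict(20, intercept, slope, min(xs), max(xs))

print(inside)   # x is within 1..5, so this is interpolation
print(outside)  # x is far beyond 5, so this prediction is extrapolation
```

The formula returns a number either way. The label is a reminder that nothing in the data supports the line beyond the range studied.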

