The flashcards below were created by user MeganM on FreezingBlue Flashcards.

More than one dependent variable and/or more than one independent variable and their relationships/correlation/etc
Multivariate procedures

used to analyze the effects of two or more independent variables on a continuous dependent variable
multiple regression analysis

multiple ______ and multiple _____ will be used almost interchangeably
correlation; regression

_______ analysis is used to make predictions
regression

one independent variable (X) is used to predict a dependent variable (Y)
simple regression

used to determine a straight line fit to the data that minimizes deviations from the line
linear regression

most are small; occur because the correlation between X and Y is not perfect (predictions are perfect only when r = +1.00 or -1.00)
errors of prediction (e)

standard regression is said to use this because the regression equation solves for a and b in a way that minimizes errors of prediction; more precisely, the solution minimizes the sums of squares of prediction errors
least squares criterion

standard regression is sometimes called this
ordinary least squares (OLS) regression
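
A minimal sketch of the least squares criterion for simple regression, in Python. The function names and the data values are illustrative assumptions, not part of the original cards. It solves for a and b in Y' = a + bX and checks that moving the line away from that solution increases the sum of squared errors of prediction.

```python
def fit_simple_regression(xs, ys):
    """Return (a, b) for Y' = a + bX using the least squares criterion."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sum of squares around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # intercept
    return a, b


def sum_squared_errors(xs, ys, a, b):
    """Sum of squared errors of prediction (e = Y - Y') for a given line."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


xs = [1, 2, 3, 4, 5]  # hypothetical X scores
ys = [2, 4, 5, 4, 5]  # hypothetical Y scores
a, b = fit_simple_regression(xs, ys)
best = sum_squared_errors(xs, ys, a, b)
worse = sum_squared_errors(xs, ys, a, b + 0.1)  # any other line fits worse
print(a, b, best, worse)
```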


Expresses how variation in one variable is associated with variation in another; if r = 0.9, then r squared = 0.81, meaning 81% of the variability in Y values can be understood in terms of variability in X values.
correlation coefficient (r)
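
As a rough illustration of the card above, the sketch below computes Pearson's r and squares it to get the proportion of shared variability. The function name and the scores are assumptions made for the example.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


xs = [1, 2, 3, 4, 5]  # hypothetical X scores
ys = [2, 4, 5, 4, 5]  # hypothetical Y scores
r = pearson_r(xs, ys)
r_squared = r ** 2    # proportion of variability in Y associated with X
print(r, r_squared)
```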

With correlation coefficients, the stronger the correlation, the better the _______ (the stronger the correlation, the greater the _______ of variance explained)
prediction; percentage

The index when using two or more independent variables (Pearson’s r is used with bivariate correlation)
multiple correlation coefficient (R)

R (unlike r) does not have negative values so it can show the _____ of relationship between several independent variables and a dependent variable.
strength

R cannot be ______; it ranges from ___ to ____.
negative; 0.00 to 1.00

R is based on _______ scores.
standardized

R can show the _____ of a prediction or relationship but NOT the _______.
strength; direction

___________ predicts a DV from more than 1 IV.
Multiple Linear Regression

What does R squared tell you?
how much all the IVs contribute to the DV

What should you do to learn how much influence each IV has on the DV?
Look at the Beta weight

Three ways of entering predictor variables.
 Simultaneous
 Hierarchical
 Stepwise

Dependent variables in multiple regression analysis (ANOVA) should be measured on a _________ scale; independent variables can be _________.
 interval or ratio
 interval or ratio OR categorical

When a regression coefficient (b) is divided by its standard error, the result is a value for the t statistic, which can be used to assess the significance of ____________.
individual predictors

A significant t indicates that the regression coefficient (b) is significantly __________.
different from zero
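
A small sketch of the t test for an individual predictor, assuming a hypothetical coefficient and standard error. The comparison value of 1.96 is the large-sample two-tailed critical value at the .05 level; with small samples the critical t for the appropriate degrees of freedom is larger, so treat the comparison as an approximation.

```python
def t_statistic(b, standard_error):
    """t for a regression coefficient: b divided by its standard error."""
    if standard_error <= 0:
        raise ValueError("standard error must be positive")
    return b / standard_error


b = 0.60               # hypothetical regression coefficient
standard_error = 0.20  # hypothetical standard error of b
t = t_statistic(b, standard_error)

# Large-sample approximation only; look up the exact critical t for small N.
differs_from_zero = abs(t) > 1.96
print(t, differs_from_zero)
```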

In ____________, the coefficients represent the number of units the dependent variable is predicted to change for each unit change in a given independent variable when the effects of other predictors are held constant (they are statistically controlled); this statistical control can enhance a study’s internal validity.
multiple regression

enters all predictor variables into the regression equation at the same time; there is no basis for considering any particular predictor as causally prior to another.
simultaneous multiple regression

involves entering predictors into the equation in a series of steps; researchers control the order of entry (typically based on theoretical considerations).
hierarchical multiple regression

empirically selecting the combination of independent variables with the most predictive power.
stepwise multiple regression

the regression coefficients for each z are standardized regression coefficients called what?
beta weights
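
One way to see how a beta weight relates to an unstandardized coefficient: multiply b by the ratio of the predictor's standard deviation to the dependent variable's standard deviation. The sketch below assumes hypothetical values for all three quantities.

```python
def beta_weight(b, sd_predictor, sd_dependent):
    """Convert an unstandardized coefficient b to a standardized beta weight."""
    return b * sd_predictor / sd_dependent


# Hypothetical values: b = 2.0, SD of X = 3.0, SD of Y = 12.0.
beta = beta_weight(2.0, 3.0, 12.0)
print(beta)
```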

___________ eliminate the problem of differing units by transforming all variables to scores with a mean of 0.0 and a standard deviation of 1.00
standard scores (z scores)

__________ are the difference between a score and the mean of that score divided by the standard deviation
z scores
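
A short sketch of the z-score transformation described above, using Python's standard `statistics` module. It assumes the sample standard deviation (n - 1 in the denominator); the scores are made up for the example.

```python
import statistics


def z_scores(values):
    """Transform raw scores to z scores: (score - mean) / standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD; an assumption of this sketch
    return [(v - mean) / sd for v in values]


scores = [10, 12, 14, 16, 18]  # hypothetical raw scores
z = z_scores(scores)

# After the transformation the mean is 0 and the standard deviation is 1.
print(statistics.mean(z), statistics.stdev(z))
```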

What is the problem with beta weights?
the regression coefficients will be the same no matter what the order of entry of the variables, but beta weights are unstable: their values tend to fluctuate from sample to sample and change if a variable is added to or removed from the regression equation, so it is difficult to attach theoretical importance to them

Power Analysis for Multiple Regression: a ratio of ______ for simultaneous and hierarchical regression and a ratio of ______ for stepwise
20:1; 40:1

Power Analysis for Multiple Regression: N should be greater than _________ times the number of predictors (independent variables)
50 + 8
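
The rule of thumb on this card can be written as N > 50 + 8m, where m is the number of predictors. A minimal sketch, with hypothetical function names:

```python
def minimum_n_rule(num_predictors):
    """Value that N should exceed: 50 + 8 times the number of predictors."""
    return 50 + 8 * num_predictors


def sample_is_large_enough(n, num_predictors):
    """True when N is greater than 50 + 8 * number of predictors."""
    return n > minimum_n_rule(num_predictors)


print(minimum_n_rule(5))              # threshold for 5 predictors
print(sample_is_large_enough(91, 5))  # N must be strictly greater
```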

An estimation of the number of participants needed to reject the null that R equals zero based on effect size, number of predictors, desired power, and the significance criterion
power analysis

used to compare the means of two or more groups, adjusts for initial differences so that the results more precisely reflect the effect of an intervention
Analysis of Covariance (ANCOVA):

offers post hoc statistical control; assumes randomization.
ANCOVA

___________ can statistically control for pretest scores: the posttest score is the DV, the IV is experimental/comparison group status, and the covariate is pretest scores
ANCOVA

usually continuous variables (ex: anxiety scores) but can sometimes be dichotomous variables (male/female)
covariates

independent variable for covariates is a ______-level variable
nominal

covariates should be variables that you suspect are correlated with the ________ variable
dependent

techniques that fit data to straight-line (linear) solutions; foundation for the t-test, ANOVA, and multiple regression
general linear model (GLM):

group of means on the dependent variable after removing the effect of covariates
adjusted means

adjusted means allow researchers to determine _________.
net effects
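
A sketch of how adjusted means can be computed with one covariate, using the standard textbook formula: adjusted mean = group mean on Y minus the pooled within-group slope times (group mean on the covariate minus grand mean on the covariate). The group names and scores are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def adjusted_means(groups):
    """groups maps a group name to (covariate_scores, outcome_scores)."""
    sxy = 0.0
    sxx = 0.0
    all_x = []
    # Pool cross-products and sums of squares within each group.
    for xs, ys in groups.values():
        mx, my = mean(xs), mean(ys)
        sxy += sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sxx += sum((x - mx) ** 2 for x in xs)
        all_x.extend(xs)
    b_within = sxy / sxx   # pooled within-group slope of Y on the covariate
    grand_x = mean(all_x)  # grand mean of the covariate
    return {
        name: mean(ys) - b_within * (mean(xs) - grand_x)
        for name, (xs, ys) in groups.items()
    }


groups = {
    # (pretest scores, posttest scores), all hypothetical
    "comparison": ([1, 2, 3], [2, 3, 4]),
    "experimental": ([3, 4, 5], [6, 7, 8]),
}
adjusted = adjusted_means(groups)
print(adjusted)
```

In this made-up data the raw posttest means differ by 4 points, but part of that gap reflects the experimental group's higher pretest scores; the adjusted means differ by less.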

used to test the significance of differences in group means for multiple dependent variables.
MANOVA

allows for the control of confounding variables (covariates) when there are two or more dependent variables.
MANCOVA

makes predictions about membership in groups; ex: predict membership in such groups as compliant vs noncompliant patients
 discriminant analysis
 (the equation is called the discriminant function)

an equation developed using discriminant analysis for a categorical dependent variable, with independent variables that are either dichotomous or continuous
discriminant function

researchers begin with data from people whose group membership is known and develop an equation to predict membership when only measures of the independent variables are available; the _________ indicates to which group each person would likely belong
discriminant function

indicates the proportion of variance unaccounted for by predictors
Wilks’ lambda

analyzes the relationship between multiple independent variables and a dependent variable; used to predict categorical dependent variables
logistic regression

used in logistic regression to estimate the parameters most likely to have generated the observed data
maximum likelihood estimation (MLE):

the factor by which the odds of the outcome change for a one-unit change in a predictor; provides an estimate of relative risk
odds ratio
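
A brief sketch of two common ways to arrive at an odds ratio: exponentiating a logistic regression coefficient, and dividing the odds in one group by the odds in a reference group. Function names and probabilities are assumptions for illustration.

```python
import math


def odds(p):
    """Odds of an event with probability p."""
    return p / (1 - p)


def odds_ratio_from_coefficient(b):
    """Odds ratio for a one-unit change in a predictor: e raised to b."""
    return math.exp(b)


def odds_ratio(p_group, p_reference):
    """Odds in one group relative to the odds in a reference group."""
    return odds(p_group) / odds(p_reference)


# A coefficient of 0 leaves the odds unchanged (OR = 1).
print(odds_ratio_from_coefficient(0.0))
# Hypothetical event probabilities of .80 and .50 in two groups.
print(odds_ratio(0.8, 0.5))
```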

dependent variable in binary logistic regression is a _______ variable
dichotomous

_______ variables can be continuous variables, categorical variables, or interaction terms; can be entered in an equation in different ways (simultaneous, hierarchical, and stepwise)
predictor

________ variables (indicator variables) are a common method of representing dichotomous predictors
dummy-coded

one group in an analysis of a variable with more than two categories is given an OR of 1.0, and the other groups (categories of the variable) have ORs in relation to the ___________.
reference group
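
To make dummy coding and the reference group concrete, the sketch below turns a categorical predictor into 0/1 indicator variables, leaving out the reference category. The helper name, the `is_` prefix, and the marital-status values are hypothetical.

```python
def dummy_code(values, reference):
    """Return one dict of 0/1 indicators per case, omitting the reference group."""
    categories = sorted(set(values) - {reference})
    return [
        {f"is_{category}": int(value == category) for category in categories}
        for value in values
    ]


marital_status = ["single", "married", "divorced"]  # hypothetical cases
coded = dummy_code(marital_status, reference="married")
for status, row in zip(marital_status, coded):
    print(status, row)
```

Cases in the reference group score 0 on every indicator, so each indicator's coefficient (or odds ratio) is interpreted relative to that group.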

based on the residuals for all cases in the analysis (the difference between the observed probability of an event and the predicted probability)
goodness-of-fit statistic

compares the prediction model to a hypothetically “perfect” model (one that contains the exact set of predictors needed to duplicate the observed frequencies in the dependent variable)
Hosmer-Lemeshow test

to test the significance of individual predictors in the model; distributed as a chi-square
Wald Statistic

most frequently reported pseudo R squared index
Nagelkerke

widely used by epidemiologists when the dependent variable is a time interval between an initial event (onset of a disease) and a terminal event (death)
survival analysis

time-related data are ________ when the observation period does not cover all possible events
censored

testing a hypothesized causal explanation of a phenomenon, typically with data from nonexperimental studies.
Causal Modeling

Two approaches to causal modeling.
 Path analysis
 Structural equations modeling (SEM)

a method for studying causal patterns among variables; not a method for discovering causes (uses least-squares estimation)
path analysis

Model of path analysis where causal flow is unidirectional (variable 2 is a cause of variable 3, and variable 3 is NOT a cause of variable 2)
recursive model

the weights representing the effect of one variable on another; indicates the proportion of a standard deviation difference in the caused variable that is directly attributable to a 1 SD difference in the specified causal variable
path coefficient

uses maximum likelihood estimation and is a more powerful approach than path analysis (path analysis assumes that causal flow is recursive/unidirectional, that variables are measured without error, and that residuals are uncorrelated; these assumptions are not usually plausible)
structural equations modeling (SEM):

can accommodate measurement errors, correlated residuals, and nonrecursive models (allows for reciprocal causation)
structural equations modeling (SEM):

can be used to analyze causal models involving latent variables (unmeasured variables corresponding to abstract constructs); the analysis proceeds in two phases
structural equations modeling (SEM):

Multivariate statistics allow for what two things?
 to examine complex phenomena
 to have 3 or more variables

In ________ one IV is used to predict a DV.
Simple Linear Regression

What does R squared tell you?
 accuracy of a prediction equation
 (How much all IVs contribute to the DV)

Sample size for simultaneous multiple regression.
20:1 (20 or more per IV)

Sample size for hierarchical multiple regression.
20:1 (20 or more per IV)

Sample size for Stepwise multiple regression.
40:1 (40 or more per IV)
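
The three ratio cards above can be turned into a quick lookup. This is only a sketch of the rule of thumb; the dictionary and function names are made up for the example.

```python
# Cases needed per predictor (IV), taken from the ratios on the cards above.
CASES_PER_PREDICTOR = {
    "simultaneous": 20,
    "hierarchical": 20,
    "stepwise": 40,
}


def minimum_cases(method, num_predictors):
    """Minimum sample size implied by the cases-per-predictor ratio."""
    return CASES_PER_PREDICTOR[method] * num_predictors


for method in CASES_PER_PREDICTOR:
    print(method, minimum_cases(method, 5))
```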

Researchers often try to improve predictions of Y by including multiple IVs, which are often called _______ variables in a multiple regression context.
predictor

What is the index in bivariate correlation? With two or more IVs?
 Pearson's r
 multiple correlation coefficient (R)

The proportion of variance in Y accounted for by the combined, simultaneous influence of the IVs.
R squared
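
A sketch of R squared as a proportion of variance accounted for, computed as 1 minus the ratio of residual to total sum of squares. It assumes the predicted values come from a least-squares equation with an intercept; the observed and predicted scores are hypothetical.

```python
def r_squared(observed, predicted):
    """Proportion of variance in Y accounted for by the regression equation."""
    mean_y = sum(observed) / len(observed)
    ss_total = sum((y - mean_y) ** 2 for y in observed)
    ss_residual = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1 - ss_residual / ss_total


observed = [2, 4, 5, 4, 5]             # hypothetical Y scores
predicted = [2.8, 3.4, 4.0, 4.6, 5.2]  # hypothetical least-squares predictions
r2 = r_squared(observed, predicted)
print(r2)
```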

R is never less than the highest r between a _______ and the _______.
predictor; dependent variable

What does a high correlation among IVs do to the predictive power?
decreases it

What happens to increments to R as more IVs are added to the regression equation?
they decrease

What is difficult to avoid as more and more variables are added to the regression equation?
redundancy

Three tests of significance for multiple linear regression.
 Tests of Overall Equation and R
 Tests for Adding Predictors
 Tests of the Regression Coefficients

What is the basic null hypothesis in a multiple regression?
R = zero
(R= population multiple correlation coefficient)
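
The null hypothesis that R equals zero is usually tested with an overall F ratio. A minimal sketch of the standard formula, F = (R squared / k) / ((1 - R squared) / (N - k - 1)), with k predictors and N cases; the input values are hypothetical.

```python
def overall_f(r_squared, n, num_predictors):
    """Overall F for the regression equation and its degrees of freedom."""
    df_regression = num_predictors
    df_residual = n - num_predictors - 1
    f = (r_squared / df_regression) / ((1 - r_squared) / df_residual)
    return f, df_regression, df_residual


# Hypothetical study: R squared = .50, N = 23, two predictors.
f, df1, df2 = overall_f(0.50, 23, 2)
print(f, df1, df2)
```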

What is used to decide if a third predictor will increase the ability to predict Y after two predictors have been used?
 F-statistic
 (tests for adding predictors)
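
A sketch of the F test for adding predictors, using the standard formula based on the increase in R squared from the smaller to the larger equation. The function name and the numbers in the usage example are assumptions.

```python
def f_for_added_predictors(r2_full, r2_reduced, n, k_full, k_reduced):
    """F for the increment in R squared when predictors are added."""
    df_added = k_full - k_reduced
    df_residual = n - k_full - 1
    f = ((r2_full - r2_reduced) / df_added) / ((1 - r2_full) / df_residual)
    return f, df_added, df_residual


# Hypothetical: two predictors give R squared = .30; a third raises it to .35.
f_change, df_added, df_residual = f_for_added_predictors(0.35, 0.30, 100, 3, 2)
print(f_change, df_added, df_residual)
```

The resulting F is compared with the critical value for the two degrees of freedom shown; if it is significant, the third predictor improves the prediction of Y beyond the first two.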

A significant t indicates that the regression coefficient is what?
significantly different from zero

In simple regression, the ______ indicates the amount of change in predicted values of Y for a specified rate of change in X. In multiple regression, the _______ represent the number of units the DV is predicted to change for each unit change in a given IV.
slope (b); regression coefficients

Strategy used when there is no basis for considering any particular predictor as causally prior to another and when the predictors are of comparable importance to the research problem.
Simultaneous Multiple Regression

Any data for which ANOVA is appropriate can be analyzed by __________, but the reverse is not true.
multiple regression

used to examine the effect of a key independent variable after first removing (controlling) the effect of confounding variables
Hierarchical multiple regression

the analog of the overall F test in multiple regression (chi-squared distribution)
Goodness-of-fit statistic

researchers posit causal linkages among three or more variables and then test whether hypothesized pathways from the causes to the effect are consistent with the data
Causal Modeling

