GU Research Mod 10

Card Set Information

2015-11-18 06:52:06
Polit Beck Research
Mod 10 Chapter 18 Multivariate Stats

  1. More than one dependent variable and/or more than one independent variable and their relationships/correlation/etc
    Multivariate procedures
  2. used to analyze the effects of two or more independent variables on a continuous dependent variable
    multiple regression analysis
  3. multiple ______ and multiple _____ will be used almost interchangeably
    • regression
    • correlation
  4. _______ analysis is used to make predictions
    regression
  5. one independent variable (X) is used to predict a dependent variable (Y)
    simple regression
  6. used to determine a straight line fit to the data that minimizes deviations from the line
    linear regression
  7. most are small; occur because the correlation between X and Y is not perfect (the correlation is perfect only when r = 1.00 or -1.00)
    errors of prediction (e)
  8. standard regression is said to use this because the regression equation solves for a and b in a way that minimizes errors of prediction; more precisely, the solution minimizes the sums of squares of prediction errors
    least squares criterion
  9. standard regression is sometimes called this
    ordinary least squares (OLS) regression
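
Cards 5 to 9 can be illustrated with a minimal sketch of simple least-squares regression. The data and function names below are hypothetical, invented only for illustration; the formulas are the standard closed-form solutions for a and b in Y' = a + bX.

```python
def fit_simple_regression(x, y):
    """Return (a, b) for the least-squares line Y' = a + bX."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares around the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept
    return a, b


def sum_squared_errors(x, y, a, b):
    """Sum of squared errors of prediction (e = Y - Y') for a given line."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))


# Hypothetical data, not taken from the card set.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = fit_simple_regression(x, y)
sse = sum_squared_errors(x, y, a, b)
print(f"a = {a:.2f}, b = {b:.2f}, SSE = {sse:.2f}")
```

Moving the line away from the fitted a and b (for example, adding 0.1 to a) gives a larger sum of squared errors, which is what the least squares criterion means.
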
  10. error terms
  11. Expresses how variation in one variable is associated with variation in another; if r = 0.9, then r squared = 0.81, meaning 81% of the variability in Y values can be understood in terms of variability in X values.
    correlation coefficient (r)
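
As a rough sketch of card 11, Pearson's r and r squared can be computed directly from deviations around the means. The data and the function name are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data, not taken from the card set.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
# r squared is the proportion of variability in Y associated with X.
print(f"r = {r:.3f}, r squared = {r ** 2:.2f}")
```
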
  12. With correlation coefficients, the stronger the correlation, the better the _______ (the stronger the correlation, the greater the _______ of variance explained)
    • prediction
    • percentage
  13. The index when using two or more independent variables (Pearson’s r is used with bivariate correlation)
    multiple correlation coefficient (R)
  14. R (unlike r) does not have negative values so it can show the _____ of relationship between several independent variables and a dependent variable.
    strength
  15. R cannot be ______; it ranges from ___ to ____.
    • negative
    • 0-1
  16. R is based on _______ scores.
  17. R can show the _____ of a prediction or relationship but NOT the _______.
    • strength
    • direction
  18. ___________ predicts a DV from more than 1 IV.
    Multiple Linear Regression
  19. What does R squared tell you?
    how much all the IVs contribute to the DV
  20. What should you do to learn how much influence each IV has on the DV?
    Look at the Beta weight
  21. Three ways of entering predictor variables.
    • Simultaneous
    • Hierarchical
    • Stepwise
  22. Dependent variables in multiple regression analysis (ANOVA) should be measured on a _________ scale; independent variables can be _________.
    • interval or ratio
    • interval or ratio OR categorical
  23. When a regression coefficient (b) is divided by its standard error, the result is a value for the t statistic, which can be used to assess the significance of ____________.
    individual predictors
  24. A significant t indicates that the regression coefficient (b) is significantly __________.
    different from zero
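
Cards 23 and 24 describe t = b / SE(b). The sketch below shows the idea for the slope of a simple regression only (one predictor, chosen for brevity); the same ratio is used for each coefficient in multiple regression. The data and function name are hypothetical.

```python
import math


def slope_t_statistic(x, y):
    """Return (b, se_b, t) for the slope of a simple least-squares regression."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    # Residual variance uses n - 2 degrees of freedom (a and b are estimated).
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    se_b = math.sqrt(sse / (n - 2) / sxx)
    return b, se_b, b / se_b


# Hypothetical data, not taken from the card set.
b, se_b, t = slope_t_statistic([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(f"b = {b:.2f}, SE = {se_b:.3f}, t = {t:.2f} on {5 - 2} df")
```

The t value would then be compared with a critical value from the t distribution to decide whether b differs significantly from zero.
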
  25. In ____________, the coefficients represent the number of units the dependent variable is predicted to change for each unit change in a given independent variable when the effects of other predictors are held constant (they are statistically controlled) - can enhance a study’s internal validity.
    multiple regression
  26. enters all predictor variables into the regression equation at the same time; there is no basis for considering any particular predictor as causally prior to another.
    simultaneous multiple regression
  27. involves entering predictors into the equation in a series of steps; researchers control the order of entry (typically based on theoretical considerations).
    hierarchical multiple regression
  28. empirically selecting the combination of independent variables with the most predictive power.
    stepwise multiple regression
  29. the regression coefficients for each z score are standardized regression coefficients called ______
    beta weights
  30. ___________ eliminate the problem of differing units by transforming all variables to scores with a mean of 0.0 and a standard deviation of 1.00
    standard scores (z scores)
  31. __________ are the difference between a score and the mean of that score divided by the standard deviation
    z scores
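
Cards 30 and 31 can be checked with a small sketch: converting raw scores to z scores yields a mean of 0 and a standard deviation of 1. The raw scores and the function name are hypothetical, and the sample standard deviation (n - 1 denominator) is an assumption of this sketch.

```python
import statistics


def z_scores(values):
    """Convert raw scores to z scores: (score - mean) / standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD, n - 1 in the denominator
    return [(v - mean) / sd for v in values]


# Hypothetical raw scores, not taken from the card set.
raw = [10, 20, 30, 40, 50]
z = z_scores(raw)
print([round(v, 3) for v in z])
print(round(statistics.mean(z), 6), round(statistics.stdev(z), 6))
```
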
  32. What is the problem with beta weights?
    the regression coefficients are the same regardless of the order in which variables are entered, but they are unstable: beta weights tend to fluctuate from sample to sample and change when a variable is added to or removed from the regression equation, so it is difficult to attach theoretical importance to them
  33. Power Analysis for Multiple Regression: a ratio of ______ for simultaneous and hierarchical regression and a ratio of ______ for stepwise
    • 20:1
    • 40:1
  34. Power Analysis for Multiple Regression: N should be greater than _________ times the number of predictors (independent variables)
    50 + 8 (that is, N > 50 + 8k, where k is the number of predictors)
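
The rules of thumb in cards 33 and 34 can be sketched as two small helpers. This assumes card 34 means N > 50 + 8k, with k the number of predictors; the function names are hypothetical.

```python
def n_by_ratio(num_predictors, method="simultaneous"):
    """Sample size implied by the cases-per-predictor ratio for each entry method."""
    ratios = {"simultaneous": 20, "hierarchical": 20, "stepwise": 40}
    return ratios[method] * num_predictors


def n_threshold_50_plus_8k(num_predictors):
    """Value that N should exceed under the 50 + 8k rule (assumed reading of card 34)."""
    return 50 + 8 * num_predictors


k = 5  # hypothetical number of predictors
print(n_by_ratio(k, "simultaneous"))  # 20:1 ratio
print(n_by_ratio(k, "stepwise"))      # 40:1 ratio
print(n_threshold_50_plus_8k(k))      # N should be greater than this
```
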
  35. An estimation of the number of participants needed to reject the null that R equals zero based on effect size, number of predictors, desired power, and the significance criterion
    power analysis
  36. used to compare the means of two or more groups, adjusts for initial differences so that the results more precisely reflect the effect of an intervention
    Analysis of Covariance (ANCOVA):
  37. offers post-hoc statistical control- assumes randomization.
  38. ___________ can statistically control for pretest scores  - the posttest score is the DV and the IV is experimental/comparison group status and the covariate is pretest scores
    Analysis of Covariance (ANCOVA)
  39. usually continuous variables (ex: anxiety scores) but can sometimes be dichotomous variables (male/female)
  40. independent variable for covariates is a ______-level variable
  41. covariates should be variables that you suspect are correlated with the ________ variable
  42. techniques that fit data to straight-line (linear) solutions; foundation for the t-test, ANOVA, and multiple regression
    general linear model (GLM):
  43. group of means on the dependent variable after removing the effect of covariates
    adjusted means
  44. adjusted means allow researchers to determine _________.
    net effects
  45. techniques that fit data to straight-line (linear) solutions; foundation for such procedures as the t-test, ANOVA, and multiple regression
    general linear model (GLM):
  46. used to test the significance of differences in group means for multiple dependent variables.
    multivariate analysis of variance (MANOVA)
  47. allows for the control of confounding variables (covariates) when there are two or more dependent variables.
    multivariate analysis of covariance (MANCOVA)
  48. makes predictions about membership in groups; ex: predict membership in such groups as compliant vs noncompliant patients
    • discriminant analysis
    • (the equation is called the discriminant function)
  49. an equation developed using discriminant analysis for a categorical dependent variable, with independent variables that are either dichotomous or continuous
    discriminant function
  50. researchers begin with data from people whose group membership is known and develop an equation to predict membership when only measures of the independent variables are available - the _________ indicates to which group each person would likely belong
    discriminant function
  51. indicates the proportion of variance unaccounted for by predictors
    Wilks’ lambda
  52. analyzes the relationship between multiple independent variables and a dependent variable; used to predict categorical dependent variables
    logistic regression
  53. used in logistic regression to estimate the parameters most likely to have generated the observed data
    maximum likelihood estimation (MLE):
  54. the factor by which the odds change; provides an estimate
    odds ratio
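
A minimal sketch of card 54: the odds ratio is the factor by which the odds change, and for a logistic regression coefficient b it equals exp(b). The probabilities and function names below are hypothetical.

```python
import math


def odds(p):
    """Odds of an event with probability p (0 < p < 1)."""
    return p / (1 - p)


def odds_ratio(p_group, p_reference):
    """Ratio of the odds in one group to the odds in the reference group."""
    return odds(p_group) / odds(p_reference)


def odds_ratio_from_coefficient(b):
    """Odds ratio implied by a logistic regression coefficient b."""
    return math.exp(b)


# Hypothetical probabilities: 0.8 in one group, 0.5 in the reference group.
print(round(odds_ratio(0.8, 0.5), 2))
# A coefficient of 0 implies an odds ratio of 1 (no change in the odds).
print(odds_ratio_from_coefficient(0.0))
```
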
  55. dependent variable in binary logistic regression is a _______ variable
  56. _______ variables can be continuous variables, categorical variables, or interaction terms; can be entered in an equation in different ways (simultaneous, hierarchical, and stepwise)
  57. ________ variables (indicator variables) are a common method of representing dichotomous predictors
    dummy
  58. in an analysis of a variable with more than two categories, one group is given an OR of 1.0, and the other groups (categories of the variable) have ORs in relation to the ___________.
    reference group
  59. based on the residuals for all cases in the analysis (the difference between the observed probability of an event and the predicted probability)
    goodness-of-fit statistic
  60. compares the prediction model to a hypothetically “perfect” model (one that contains the exact set of predictors needed to duplicate the observed frequencies in the dependent variable)
    Hosmer-Lemeshow test
  61. to test the significance of individual predictors in the model; distributed as a chi-square
    Wald Statistic
  62. most frequently reported pseudo R squared index
  63. widely used by epidemiologists when the dependent variable is a time interval between an initial event (onset of a disease) and a terminal event (death)
    survival analysis
  64. time-related data are ________ when the observation period does not cover all possible events
    censored
  65. testing a hypothesized causal explanation of a phenomenon, typically with data from non experimental studies.
    Causal Modeling
  66. Two approaches to causal modeling.
    • Path analysis
    • Structural equations modeling (SEM)
  67. a method for studying causal patterns among variables; not a method for discovering causes (uses least-squares estimation)
    path analysis
  68. Model of path analysis where causal flow is unidirectional (variable 2 is a cause of variable 3, and variable 3 is NOT a cause of variable 2)
    recursive model
  69. the weights representing the effect of one variable on another; indicates the proportion of a standard deviation difference in the caused variable that is directly attributable to a 1 SD difference in the specified causal variable
    path coefficient
  70. uses maximum likelihood estimation and is a more powerful approach than path analysis (which assumes that causal flow is recursive/unidirectional, that variables are measured without error, and that residuals are uncorrelated; these assumptions are usually not plausible)
    structural equations modeling (SEM):
  71. can accommodate measurement errors, correlated residuals, and nonrecursive models (allows for reciprocal causation)
    structural equations modeling (SEM):
  72. can be used to analyze causal models involving latent variables (an unmeasured variable corresponding to an abstract construct); proceeds in two phases
    structural equations modeling (SEM):
  73. Multivariate statistics allow for what two things?
    • to examine complex phenomena
    • to analyze 3 or more variables
  74. In ________ one IV is used to predict a DV.
    Simple Linear Regression
  75. What does R squared tell you?
    • accuracy of a prediction equation
    • (How much all IVs contribute to the DV)
  76. Sample size for simultaneous multiple regression.
    20:1 (20 or more per IV)
  77. Sample size for hierarchical multiple regression.
    20:1 (20 or more per IV)
  78. Sample size for Stepwise multiple regression.
    40:1 (40 or more per IV)
  79. Researchers often try to improve predictions of Y by including multiple IVs, which are often called _______ variables in a multiple regression context.
    predictor
  80. What is the index in bivariate correlation? With two or more IVs?
    • Pearson's r
    • multiple correlation coefficient (R)
  81. The proportion of variance in Y accounted for by the combined, simultaneous influence of the IVs.
    R squared
  82. R is never less than the highest r b/w a _______ and the _______.
    • predictor
    • DV
  83. What does a high correlation among IVs do to the predictive power?
    decreases it
  84. What happens to increments to R as more IVs are added to the regression equation?
    they decrease
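
Cards 81 to 83 can be explored with the standard two-predictor formula for R squared expressed in terms of the three bivariate correlations. The correlation values and the function name are hypothetical; with other values (for example, suppressor effects) the pattern can differ.

```python
import math


def r_squared_two_predictors(r_y1, r_y2, r_12):
    """R squared for Y regressed on X1 and X2, from the bivariate correlations.

    r_y1, r_y2: correlations of each predictor with Y
    r_12: correlation between the two predictors
    """
    if abs(r_12) >= 1:
        raise ValueError("predictors must not be perfectly correlated")
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)


# Hypothetical correlations of the predictors with Y.
r_y1, r_y2 = 0.6, 0.5
for r_12 in (0.0, 0.5, 0.8):
    r2 = r_squared_two_predictors(r_y1, r_y2, r_12)
    print(f"r_12 = {r_12:.1f}: R squared = {r2:.3f}, R = {math.sqrt(r2):.3f}")
```

For these values, R squared falls as the two predictors become more highly correlated with each other, yet R stays at or above 0.6, the highest bivariate r between a predictor and the DV.
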
  85. What is difficult to avoid as more and more variables are added to the regression equation?
  86. Three tests of significance for mult linear regression.
    • Tests of Overall Equation and R
    • Tests for Adding Predictors
    • Tests of the Regression Coefficients
  87. What is the basic null hypothesis in a multiple regression?
    R = 0 (R = population multiple correlation coefficient)
  88. What is used to decide if a third predictor will increase the ability to predict Y after two predictors have been used?
    • F-statistic
    • (tests for adding predictors)
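
The F tests in cards 86 to 88 can be sketched with the usual formula for comparing a fuller model with a reduced one. Setting the reduced model to zero predictors and R squared of 0 gives the overall test of the null hypothesis R = 0. All numbers and the function name are hypothetical.

```python
def f_for_added_predictors(r2_full, r2_reduced, k_full, k_reduced, n):
    """F statistic for the increase in R squared when predictors are added.

    k_full, k_reduced: number of predictors in each model
    n: sample size
    Returns (F, df_numerator, df_denominator).
    """
    df_num = k_full - k_reduced
    df_den = n - k_full - 1
    if df_num <= 0 or df_den <= 0:
        raise ValueError("need k_full > k_reduced and n > k_full + 1")
    f = ((r2_full - r2_reduced) / df_num) / ((1 - r2_full) / df_den)
    return f, df_num, df_den


# Hypothetical example: a third predictor raises R squared from .40 to .45, N = 100.
f, df1, df2 = f_for_added_predictors(0.45, 0.40, k_full=3, k_reduced=2, n=100)
print(f"F({df1}, {df2}) = {f:.2f}")

# Overall test of the equation: reduced model has no predictors and R squared = 0.
f_all, df1_all, df2_all = f_for_added_predictors(0.45, 0.0, k_full=3, k_reduced=0, n=100)
print(f"F({df1_all}, {df2_all}) = {f_all:.2f}")
```

Each F would be compared with a critical value from the F distribution for the stated degrees of freedom.
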
  89. A significant t indicates that the regression coefficient is what?
    significantly different from zero
  90. In simple regression, the ______ indicates the amount of change in predicted values of Y for a specified rate of change in X. In multiple regression, the _______ represent the number of units the DV is predicted to change for each unit change in a given IV.
    • value of b
    • coefficients
  91. Strategy used when there is no basis for considering any particular predictor as causally prior to another and when the predictors are of comparable importance to the research problem.
    Simultaneous Multiple Regression
  92. Any data for which ANOVA is appropriate can be analyzed by __________, but the reverse is not true.
    multiple regression
  93. used to examine the effect of a key independent variable after first removing (controlling) the effect of confounding variables
    Hierarchical multiple regression
  94. the analog of the overall F test in multiple regression (chi-squared distribution)
    Goodness-of-fit statistic
  95. researchers posit causal linkages among three or more variables and then test whether hypothesized pathways from the causes to the effect are consistent with the data
    Causal Modeling