Reliability and Validity

Card Set Information

2012-10-15 18:49:04

construct validity, reliability, internal validity, external validity, statistical conclusion validity

  1. What is a construct?
    • An underlying trait that is responsible for some observable behavior
    • Indirect measure
  2. What is construct validity?
    • Refers to inferences made from measured variables to theoretical constructs
    • Examines how well the assessment “matches up with” the construct
  3. What is validity?
    “the degree to which evidence and theory support the interpretations of test scores entailed by proposed uses of the test”
  4. What are the two threats to validity?
    • Construct Underrepresentation (Less)
    • Construct Irrelevant Variance (More)
  5. What is construct underrepresentation?
    • Not measuring the construct as broadly as intended
    • “the degree to which a test fails to capture important aspects of the construct”*
  6. What is construct irrelevant variance?
    Consistently measuring something that is not part of the construct
  7. What are the five sources of validity evidence?
    • Evidence based on Test Content
    • Evidence based on Response Processes
    • Evidence based on Internal Structure
    • Evidence based on Relations to Other Variables
    • Evidence based on Consequences of Testing
  8. What does test content refer to?
    • Analysis of the relationship between the test’s content and the construct of interest
    • Themes, wording, and format of the items, tasks, or questions on a test, as well as the procedural guidelines for its administration and scoring
  9. What does response process validity refer to?
    • Analyses of examinees’ response processes are used to determine the fit between the construct and the examinees’ actual performance or responses
    • Includes the processes of judges, raters, or observers when evaluating examinees’ performances
  10. What types of evidence support content validity?
    • Use logical or empirical analysis
    • Most frequently relies on “expert” judgment
    • Part of test development
  11. What types of evidence support response process validity?
    • Theoretical and empirical
    • Process focused on a study of examinees
    •      Interview test takers
    •      Think alouds
    • Examine rating or scoring process
  12. What does internal structure refer to?
    The degree to which the relationships among test items and test components conform to the construct on which the interpretations are based.
  13. What types of evidence support internal structure validity?
    • Item analysis
    • Reliability
    • Factor analysis
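Reliability as internal-structure evidence is often summarized with an internal-consistency coefficient. The card does not name one, so the sketch below assumes Cronbach's alpha; the `responses` table is made-up data for illustration only.

```python
import statistics

def cronbach_alpha(rows):
    """rows: one list of item scores per respondent (same items in every row)."""
    k = len(rows[0])                      # number of items
    items = list(zip(*rows))              # transpose to per-item columns
    sum_item_var = sum(statistics.variance(col) for col in items)
    total_var = statistics.variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - sum_item_var / total_var)

# Hypothetical data: 5 respondents answering 4 items on a 1-5 scale.
responses = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
]
alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.2f}")
```

Values closer to 1 suggest the items vary together, which is consistent with (but does not prove) a single underlying construct.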
  14. What does external validity refer to?
    • relationship between the test scores and variables outside of the test
    • e.g., the GRE predicts student success in graduate school
    • Relationship to other tests
  15. What types of evidence support external validity?
    • Convergent and discriminant evidence
    • Test-criterion evidence (how well scores predict a criterion; which criterion matters depends on the test's purpose)
    • Validity generalization (can the evidence for the instrument be generalized to other situations?)
  16. What is convergent evidence?
    Relationship between scores on measures of the same construct
  17. What is discriminant evidence?
    Relationship between test scores and measures of different constructs; these should not be related
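Convergent and discriminant evidence are usually examined with correlations. The following is a minimal sketch with made-up scores; the three measures and their names are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical scores for six examinees (illustrative numbers only).
new_scale = [12, 18, 25, 31, 36, 40]
established_scale = [10, 20, 24, 33, 35, 42]   # intended to measure the same construct
unrelated_measure = [9, 7, 10, 8, 9, 8]        # intended to measure a different construct

convergent_r = pearson_r(new_scale, established_scale)
discriminant_r = pearson_r(new_scale, unrelated_measure)
print(f"convergent r = {convergent_r:.2f}, discriminant r = {discriminant_r:.2f}")
```

A high first correlation and a near-zero second one is the pattern that would support the intended interpretation.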
  18. What does consequential validity (consequences of testing) refer to?
    • Intended consequences happen
    • Unintended consequences are not occurring, especially issues of bias and fairness
  19. What types of evidence support consequences of testing?
    • Intended consequences
    •      Shaping the curriculum
    •      Teaching the content of the test
    • Unintended consequences
    •      Narrowing the curriculum
    •      Teaching to the test and only the test
    •      Subpopulation differences
    • Develops over time
  20. What is internal validity?
    The extent to which a cause and effect relationship is isolated from competing influences
  21. What is the purpose of internal validity?
    • Determines the soundness of conclusions/interpretations of a causal relationship
    • Asks whether the variables are influenced by other variables
  22. What are the four categories of threats to internal validity?
    • Time threats
    • Group threats
    • Mortality
    • Atypical behavior
  23. What is a time threat in internal validity?
    Impacts on dependent variable over time are different because of factors other than the treatment variable
  24. What are group threats to internal validity?
    • Explanations for changes in the variables other than experimental differences created by the researcher
    • Selection of the groups may cause the differences
  25. What are mortality threats to internal validity?
    • Loss of subjects in a study
    • Impacts all studies long enough to have dropouts
  26. What are atypical behavior threats to internal validity?
    • Threats that research design cannot eliminate
    • Actions of those in the groups which change the treatment (Treatment and control group interact about the intervention)
  27. What is testing as a time threat and possible solutions?
    • The effect of the first test on the second
    • Effect of a publication about the treatment
    • Solutions:
    •      Lengthen the time interval between tests
    •      Disguise the use of the pretest (don't disclose the next test)
    •      Use control groups
  28. What is history (time threat) and possible solutions?
    • An event that occurs during treatment that affects subject response
    • Usually events that could be controlled
    • Solutions:
    •      Use control groups
    •      Use a shorter time
  29. What is maturation as a time threat and possible solutions?
    • Naturally occurring processes within participants that unfold over time and may change their performance
    • Includes fatigue, boredom, growth, intellectual development
    • Solutions:
    •      Shorten the time of the study
    •      Use a control group with a similar maturation rate
  30. What is instrumentation as a time threat and possible solutions?
    • Changes in measurement procedure in a pretest-posttest study
    • Includes calibration, rater changes, score use
    • Solutions:
    •      Use a control group
    •      Standardize the measurement procedure
  31. What is pre-experimental research design?
    • One group with post-test only such as pilot testing
    • Or
    • One group with pre- and post-test
    • These are the weakest designs because they do not strongly link group changes to treatment
  32. What are quasi experimental designs?
    • Designs that involve a control group but do not use random assignment
    • Or
    • One experimental group with multiple tests, the first is a baseline
  33. What are possible threats to quasi experimental design?
    • Time threats
    • Selection (group changes at time of intervention)
    • Instrumentation
    • Maturation
  34. What is experimental design?
    • Pre-test, post-test, control group design with random assignment
    • Post-test-only control group design with random assignment
    • Still has threats to validity but usually eliminates group threats
  35. How to improve internal validity?
    • Use random assignment
    • Use Pretest
    • Use control/comparison group
  36. What are types of statistical designs?
    • Within-subject designs
    • Between subjects designs
    • Mixed designs
  37. What are types of experimental designs?
    • Pre-experimental designs
    • Quasi experimental designs
    • Experimental design
  38. What is a within subject design?
    • Measures individuals within a group multiple times both before and after treatment
    • Change over time is of interest
    • Applies to pre-experimental and quasi experimental
  39. What is between subjects design?
    • Compares the scores of two or more groups
    • Group comparisons are of interest
    • Applies to quasi-experimental and experimental designs
  40. What is mixed design (split-plot)?
    • Both between and within designs used
    • e.g., look at individual student scores as a result of a treatment in two classes
    • Applies to quasi-experimental and experimental
  41. How can you control for differences between groups?
    • Statistical adjustment based on theoretical argument
    • Matching of variables between groups
    • Random assignment creates random equivalence on all variables
    • Matching and Random Assignment
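Matching combined with random assignment can be sketched as matched pairs: sort on the matching variable, pair neighbours, then randomly split each pair. The participant records and the `matched_random_assignment` helper are hypothetical, and pretest score is assumed to be the matching variable.

```python
import random
import statistics

def matched_random_assignment(participants, key, rng):
    """Pair neighbours on the matching variable, then randomly split each pair.

    With an odd number of participants, the last one is left unassigned.
    """
    ordered = sorted(participants, key=key)
    treatment, control = [], []
    for i in range(0, len(ordered) - 1, 2):
        pair = [ordered[i], ordered[i + 1]]
        rng.shuffle(pair)                 # chance decides who gets the treatment
        treatment.append(pair[0])
        control.append(pair[1])
    return treatment, control

# Hypothetical participants: (id, pretest score).
scores = [52, 47, 61, 58, 70, 66, 43, 75, 55, 63]
participants = [(f"P{i:02d}", s) for i, s in enumerate(scores, start=1)]

rng = random.Random(0)                    # fixed seed so the sketch is repeatable
treatment, control = matched_random_assignment(participants, key=lambda p: p[1], rng=rng)
print("treatment mean:", statistics.mean(s for _, s in treatment))
print("control mean:  ", statistics.mean(s for _, s in control))
```

Matching only balances the variable that was matched on; the random split within each pair is what addresses everything else.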
  42. What are group threats to internal validity?
    • Regression towards the mean
    • Selection
    • Selection by time interactions
    •      selection by maturation
    •      selection by history
  43. What is regression to the mean?
    • Applies to pre-post design
    • Scores (both high and low) regress to the mean
    • Inflates low group and deflates high group change
    • Impacts in gain score situations with extreme groups
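A small simulation can make the effect concrete. This sketch assumes a simple model (stable true score plus independent measurement error on each test) and applies no treatment at all, so any "gain" in the extreme groups is purely regression to the mean.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Assumed model: observed score = stable true score + independent error.
true_scores = [rng.gauss(100, 10) for _ in range(5000)]
pretest = [t + rng.gauss(0, 10) for t in true_scores]
posttest = [t + rng.gauss(0, 10) for t in true_scores]  # no treatment applied

# Select extreme groups using the pretest only.
ranked = sorted(range(len(pretest)), key=lambda i: pretest[i])
low_group, high_group = ranked[:500], ranked[-500:]

def mean_gain(group):
    """Average posttest-minus-pretest change for the given person indices."""
    return statistics.mean(posttest[i] - pretest[i] for i in group)

low_gain = mean_gain(low_group)
high_gain = mean_gain(high_group)
print(f"low group gain: {low_gain:+.1f}, high group gain: {high_gain:+.1f}")
```

The low group appears to improve and the high group appears to decline even though nothing was done to either.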
  44. How to mitigate the effects of regression to the mean?
    • Avoid comparison of extreme groups if possible
    • Retest-retest to establish a more "stable" baseline
    • Use a control group for each extreme
    • Use high reliability measures
  45. What is selection threat?
    • A threat that is due to the different group characteristics present at the start of the study
    • Affects all quasi-experimental studies
  46. What are interactions with selection?
    • Internal validity threats interact with selection to produce effects not due to treatment
    • Selection maturation (mature at different rates)
    • Selection history (groups from different settings so different events)
  47. How to mitigate the effects of selection and interactions?
    • Random assignment
    • Matching (only relates to variables being matched)
    • Random assignment and matching
    • Check pretest equivalence, e.g., with an ANOVA (applies to quasi-experimental and true experimental designs)
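A pretest-equivalence check with a one-way ANOVA can be sketched by computing the F statistic by hand. The two classes and their scores are hypothetical. A p-value needs the F distribution, which the standard library does not provide; `scipy.stats.f_oneway` is one option if SciPy is available.

```python
import statistics

def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_scores = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_scores)
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical pretest scores for two intact classes.
class_a = [52, 55, 61, 58, 49, 60]
class_b = [54, 53, 59, 57, 51, 62]

f_stat, df_b, df_w = one_way_anova_f(class_a, class_b)
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
```

An F well below 1 gives no sign that the groups differed at pretest, although a non-significant result does not prove equivalence.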
  48. What are mortality threats?
    • Subjects leave the study for systematic, non-random reasons
    • Results in a selection artifact because the posttest group differs from the original group
  49. How to mitigate the impact of mortality threats?
    • Control groups if mortality cause is the same
    • Shorter time interval between start and finish
    • Monitor incidence of mortality
    • Use pretest to compare scores of those who dropped and those who did not
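Monitoring mortality and comparing pretest scores of dropouts and completers might look like the sketch below. The `records` list is made-up data, and a plain difference in means stands in for whatever comparison a real study would use.

```python
import statistics

# Hypothetical records: (participant id, pretest score, completed the posttest?)
records = [
    ("P01", 48, True), ("P02", 35, False), ("P03", 52, True), ("P04", 61, True),
    ("P05", 38, False), ("P06", 57, True), ("P07", 41, False), ("P08", 66, True),
]

completers = [score for _, score, done in records if done]
dropouts = [score for _, score, done in records if not done]

attrition_rate = len(dropouts) / len(records)
pretest_gap = statistics.mean(completers) - statistics.mean(dropouts)

print(f"attrition rate: {attrition_rate:.1%}")
print(f"completers scored {pretest_gap:.1f} points higher at pretest")
```

A large gap suggests dropout was systematic, so the posttest group no longer resembles the group that started.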
  50. What are atypical behavior threats?
    • Not considered part of validity by everyone because they cannot be controlled by design
    • Differences between groups not caused by the design
    • Caused by group communication or public knowledge of treatment groups
  51. How to mitigate the impact of atypical behavior threats?
    • NOT controlled by random assignment
    • Monitor the research project
    • If possible prevent groups from communicating with each other
  52. How is external validity determined?
    Look at threats to the sample that prevent generalizing to the population of interest
  53. What is a population?
    Complete set of observations about which we draw conclusions
  54. What is an experimentally accessible population?
    The subset of the population from which the sample is drawn
  55. What is a sample?
    Actual observations included in the study
  56. What are the steps in selecting a sample?
    • 1. Define the observation unit (individual or object)
    • 2. Define the target population
    • 3. Define the boundaries of the population (affects generalizability)
    • 4. Define the sampling technique
    • 5. Obtain a sampling frame to use with the technique
    • 6. Select the sample
  57. What are the types of samples?
    • Probability samples
    •      Simple random, Systematic, stratified, cluster
    • Non-probability samples:
    •      Convenience, quota, purposive
  58. What is a simple random sample?
    • Every item has an equal chance of being selected
    • Has no control on sample make-up
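As a minimal sketch, a simple random sample can be drawn with the standard library's `random.sample`. The sampling frame of 200 student IDs is hypothetical.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical sampling frame of 200 units.
sampling_frame = [f"student_{i:03d}" for i in range(1, 201)]

# Draw 20 units without replacement; every unit has the same chance of selection.
sample = rng.sample(sampling_frame, k=20)
print(sample[:5])
```

Nothing here controls who ends up in the sample, which is the "no control on sample make-up" point above.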
  59. What is systematic sampling?
    • Select every kth element from the sampling frame
    • Starting point should be randomly selected
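A sketch of the every-kth-element rule with a random start follows. The `systematic_sample` helper and the frame are hypothetical, and the interval is assumed to be `len(frame) // n`.

```python
import random

def systematic_sample(frame, n, rng):
    """Take every k-th element after a random start, with k = len(frame) // n."""
    k = len(frame) // n
    start = rng.randrange(k)          # random starting point within the first interval
    return frame[start::k][:n]

rng = random.Random(1)
frame = [f"student_{i:03d}" for i in range(1, 201)]   # hypothetical ordered frame
sample = systematic_sample(frame, 20, rng)
print(sample[:3])
```

If the frame's ordering has a cycle that lines up with k, the sample can be biased, which is one reason the random start matters.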
  60. What is stratified sampling?
    • Separate samples are created for each stratum (a subgroup defined by a characteristic of interest)
    • Ensures each characteristic of interest is represented
    • Can be weighted in relation to population
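The sketch below draws a separate random sample within each stratum, using proportional allocation so that each stratum's share of the sample mirrors its share of the population. The student records and the `stratified_sample` helper are hypothetical.

```python
import random
from collections import Counter, defaultdict

def stratified_sample(units, stratum_of, fraction, rng):
    """Draw the same fraction at random from every stratum (at least one unit each)."""
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for name in sorted(strata):
        members = strata[name]
        k = max(1, round(fraction * len(members)))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: (id, class year).
units = (
    [(f"F{i}", "freshman") for i in range(60)]
    + [(f"S{i}", "sophomore") for i in range(30)]
    + [(f"R{i}", "senior") for i in range(10)]
)

rng = random.Random(5)
sample = stratified_sample(units, stratum_of=lambda u: u[1], fraction=0.2, rng=rng)
counts = Counter(year for _, year in sample)
print(counts)
```

Even the smallest stratum is guaranteed representation, which a simple random sample of the same size would not ensure.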
  61. What is cluster sampling?
    • The population is divided into heterogeneous clusters
    • Clusters should be already formed in the population
    • One cluster is randomly selected
  62. What is multi-stage cluster sampling?
    • Two step procedure after clusters are identified
    • 1. Random selection of a cluster
    • 2. Random sample within each selected cluster, often based on strata
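The two steps can be sketched as follows. Schools are assumed to be the clusters and students the units; all names are hypothetical, and the second stage uses a plain random sample rather than strata.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, rng):
    """Stage 1: pick clusters at random. Stage 2: sample units inside each one."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {
        name: rng.sample(clusters[name], min(n_per_cluster, len(clusters[name])))
        for name in chosen
    }

# Hypothetical clusters: five schools with 30 students each.
clusters = {
    f"school_{c}": [f"school_{c}_student_{i:02d}" for i in range(30)]
    for c in "ABCDE"
}

rng = random.Random(11)
sample = two_stage_cluster_sample(clusters, n_clusters=2, n_per_cluster=5, rng=rng)
for school, students in sample.items():
    print(school, students)
```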
  63. What is probabilistic cluster sampling?
    • Random or stratified random (can underrepresent larger units)
    • Probability Proportionate to size (units are weighted based on size of unit)
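Probability proportionate to size can be illustrated with weighted random draws. This sketch repeats a single weighted draw many times to show the long-run selection rates; the cluster sizes are made up, and selecting several clusters without replacement would need a more careful procedure than shown here.

```python
import random
from collections import Counter

rng = random.Random(7)

# Hypothetical clusters and their sizes (e.g., school enrolments).
cluster_sizes = {"school_A": 1000, "school_B": 300, "school_C": 100}
names = list(cluster_sizes)
weights = [cluster_sizes[n] for n in names]

# Each draw selects one cluster with probability size / total size.
draws = Counter(rng.choices(names, weights=weights, k=10_000))

total = sum(weights)
for name in names:
    print(f"{name}: target {cluster_sizes[name] / total:.3f}, observed {draws[name] / 10_000:.3f}")
```

With equal-probability selection each school would be picked about a third of the time, so students in the largest school would be underrepresented.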
  64. What is convenience sampling?
    • Non-probabilistic sampling
    • Based on easily accessible elements in a population
    • Subject pools
    • Volunteers
    • Often subject to all external validity threats
  65. What is quota sampling?
    • Non-probabilistic sampling
    • Like stratified without random element
    • Begin with a matrix targeting characteristics
    • Collect data from each person having the characteristics
    • Tries to represent the population
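Quota sampling can be sketched as filling cells of a target matrix in arrival order, with no random element. The stream of people, the single characteristic used, and the `quota_sample` helper are all hypothetical.

```python
def quota_sample(stream, category_of, quotas):
    """Accept people in arrival order until every quota cell is full."""
    remaining = dict(quotas)
    sample = []
    for person in stream:
        cell = category_of(person)
        if remaining.get(cell, 0) > 0:
            sample.append(person)
            remaining[cell] -= 1
        if not any(remaining.values()):   # all cells filled, stop collecting
            break
    return sample

# Hypothetical arrivals: (name, group), in the order they were encountered.
stream = [
    ("Ann", "F"), ("Bea", "F"), ("Cat", "F"), ("Dan", "M"),
    ("Eve", "F"), ("Fay", "F"), ("Gus", "M"), ("Hal", "M"),
]
sample = quota_sample(stream, category_of=lambda p: p[1], quotas={"F": 2, "M": 2})
print([name for name, _ in sample])
```

The result matches the target matrix, but who fills each cell depends on who happened to show up first, so selection probabilities are unknown.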
  66. What is purposive sampling or judgmental sampling?
    • Non-probabilistic sampling
    • Used for pilot testing or manipulation
    • Use when members of subset are easy to identify but hard to include all
    • Use when only generalizing to that subset
  67. What is sampling error?
    • The difference between a sample statistic and the corresponding population parameter
    • Can be estimated with probabilistic sampling methods
    • NOT an error made in creating the sample
  68. What is a sampling frame, and what is coverage bias?
    • Sampling frame: a listing of the population from which a sample will be drawn
    • Coverage bias: as the frame is formed, consider how results will be affected if some groups are not represented
  69. What is non-response bias?
    • Bias results when non-responders differ from responders
    • Is there a systematic difference between those that did not participate and those that did?
    • Can be accounted for through statistical modeling
  70. What is external validity?
    • How well do your findings generalize across settings, samples and times?
    • How can we generalize to the population on the basis of sample-based results?
  71. What are threats to external validity?
    • Interaction of selection and treatment
    • Interaction of setting and treatment
    • Interaction of history and treatment
  72. What are threats to interaction of selection and treatment?
    • Occurs when treatment effects only generalize to those selected in the same way as the sample
    • Applies to non-probabilistic sampling
    • Solutions:
    •      Random selection
    •      Make participation convenient
  73. What are threats to interaction of setting and treatment?
    • Treatment effects generalize only to settings used in the study
    • Solution:
    •      Vary the settings and analyze the IV/DV relationship within settings
  74. What are threats to interaction of history and treatment?
    • Consider when the sample was collected and whether the results will generalize to other time frames
    • Solutions:
    •      Replicate the study at different times
    •      Conduct a research review to see if prior evidence refutes the relationship
  75. What is sampling error?
    Sampling error is the discrepancy between a sample statistic and the corresponding population parameter.
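A short simulation shows sampling error as the gap between a sample mean and the population mean, and how it tends to shrink with larger samples. The population here is simulated, so every number is illustrative.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Simulated population of 10,000 scores; its mean is the parameter of interest.
population = [rng.gauss(50, 10) for _ in range(10_000)]
mu = statistics.mean(population)

def sampling_error(n):
    """Sample mean minus population mean for one simple random sample of size n."""
    sample = rng.sample(population, n)
    return statistics.mean(sample) - mu

errors_small = [abs(sampling_error(25)) for _ in range(200)]
errors_large = [abs(sampling_error(400)) for _ in range(200)]

print(f"typical |error| with n=25:  {statistics.mean(errors_small):.2f}")
print(f"typical |error| with n=400: {statistics.mean(errors_large):.2f}")
```

No mistake was made in drawing any of these samples; the error comes only from observing part of the population.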
  76. What are null and alternative hypotheses?
    • Null: what is assumed to be true about a situation at the start of an experiment
    • Alternative: what is true if the null is false; research usually tries to show that the null no longer holds because of the treatment
  77. What is a Type I error?
    • Null hypothesis is true but you reject it and conclude the alternative is true
    • You are concluding treatment was effective when it was not
  78. What is a Type II error?
    • The null hypothesis is false but you decide not to reject it.
    • You conclude the treatment was ineffective when it was actually effective
  79. What is significance level?
    • The probability of making a Type I error
    • Most common is .05 (needs to be lower if loss of income, job, health, or life is at stake)
    • Defines an unlikely sample assuming a null hypothesis is true
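The link between the significance level and the Type I error rate can be checked by simulation. This sketch assumes a two-sided one-sample z-test with known standard deviation and generates data for which the null hypothesis is true, so every rejection is a Type I error.

```python
import math
import random
from statistics import NormalDist, mean

rng = random.Random(3)  # fixed seed so the sketch is repeatable
alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value

def rejects_null(n=30, true_mean=0.0, sigma=1.0):
    """Run one z-test of H0: mean = 0 on a fresh sample; True means 'reject'."""
    sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
    z = mean(sample) / (sigma / math.sqrt(n))
    return abs(z) > z_crit

trials = 4000
type_i_rate = sum(rejects_null() for _ in range(trials)) / trials
print(f"rejected a true null in {type_i_rate:.1%} of {trials} trials")
```

The observed rate should land near the chosen alpha, which is what "the probability of making a Type I error" means in practice.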
  80. What is power?
    • The probability that you correctly reject a false null hypothesis
    • or
    • The probability of rejecting the null when the alternative is true
    • 1 - β, where β is the probability of a Type II error
  81. What are influences on power?
    • Increasing any of the following increases power:
    •      Sample size
    •      Significance level (but a higher significance level makes Type I errors more likely)
    •      Effect size
    • Increasing variance DECREASES power
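These influences can be checked numerically. The sketch assumes a two-sided one-sample z-test with known standard deviation, for which power has a closed form; the baseline numbers (effect of 5 points, standard deviation 15, n = 30) are arbitrary illustrations.

```python
import math
from statistics import NormalDist

def z_test_power(effect, sigma, n, alpha=0.05):
    """Power of a two-sided one-sample z-test for a true mean shift of `effect`."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect / (sigma / math.sqrt(n))    # effect in standard-error units
    return std_normal.cdf(-z_crit + shift) + std_normal.cdf(-z_crit - shift)

baseline = z_test_power(effect=5, sigma=15, n=30)
print(f"baseline power:       {baseline:.2f}")
print(f"larger sample (n=60): {z_test_power(5, 15, 60):.2f}")
print(f"alpha = .10:          {z_test_power(5, 15, 30, alpha=0.10):.2f}")
print(f"larger effect (8):    {z_test_power(8, 15, 30):.2f}")
print(f"more variance (sd=25): {z_test_power(5, 25, 30):.2f}")
```

When the effect is zero, the same formula returns alpha, since every rejection is then a Type I error.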