The flashcards below were created by user on FreezingBlue Flashcards.
Validity
The degree to which an instrument measures what the evaluator wants it to measure: predictor, impact, or outcome variables. More formally, validity is the extent of concordance between a measure and its underlying theoretical values.
Construct Validity
The extent to which the latent variable is the underlying cause of item covariation. (Is the latent variable the underlying cause of item covariation? Is the scale measuring what it is intended to measure?)
Constructs
Represent a single concept or idea; may be multidimensional. However, constructs do not represent hypotheses about relationships between latent variables. They are only hypotheses about how one latent variable influences a set of items (a scale), or how the construct relates to the set of items used to measure it.
Face Validity
- A type of content validity: the extent to which an instrument appears to measure what it is supposed to measure. Established through expert review.
Problems with face validity:
- The construct may not be what it appears to be.
- Self-evident content may not be appropriate; e.g., there may be social desirability issues around responses.
- "Whose face?": should the questions be self-evident to the researcher or to the participant?
Content Validity
Extent to which the instrument includes the full breadth/all dimensions of the content of the construct being measured. Established through literature reviews, focus groups, and/or expert review.
Criterion-Related Validity
Extent to which the instrument correlates with the established "gold standard" measure.
There are two ways to establish this: concurrent (administer the measures simultaneously) and predictive (administer the new measure first, follow with the gold standard measure).
Construct Validity
- Extent to which a set of items correlates with each other (that is, measures only one construct); extent to which the measure relates to other measures consistently with theory. Important when no criterion exists.
- Convergent: correlates with other measures of the same or related constructs.
- Divergent: has little correlation with unrelated constructs.
Reliability
- The extent to which an instrument produces the same result when applied multiple times.
- Test-retest: administered multiple times; high correlation indicates high reliability.
- Internal consistency: Cronbach's alpha is a common approach.
Reliability = True Score Variance / Observed Score Variance
- Scale reliability is the proportion of variance attributable to the true score of the latent variable.
- The scale score should represent the true score as much as possible, and not other extraneous factors.
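The ratio above can be illustrated with a minimal simulation (a hypothetical sketch; the true-score and error variances chosen below are arbitrary, not values from the text):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate 1,000 respondents: true scores on the latent variable,
# plus independent random measurement error.
true_score = rng.normal(50, 10, size=1000)   # true-score variance ~ 100
error = rng.normal(0, 5, size=1000)          # error variance ~ 25
observed = true_score + error

# Reliability = true score variance / observed score variance
reliability = true_score.var() / observed.var()
print(round(reliability, 2))  # close to 100 / (100 + 25) = 0.80
```

The more error variance the instrument adds, the smaller this ratio becomes, which is exactly why reducing extraneous factors raises reliability.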
Three Classes of Observables
- Direct observables: things we can observe rather simply and directly (e.g., height)
- Indirect observables: can be observed through some relatively concrete and agreed-upon indicator (e.g., rebounding ability)
- Constructs: theoretical concepts that are based on empirical observations (data) but cannot be observed directly or indirectly (e.g., basketball creativity or toughness)
Some Measurement Terms
- Concept (e.g., healthy community design): broad, abstract idea we're interested in
- Construct (e.g., walkable neighborhood): the specific 'scientific' thing we're trying to measure
- Dimension (e.g., safety, land use diversity, aesthetics): homogeneous sub-elements of the construct being studied (1 or more)
- Indicator (e.g., traffic, availability of nearby shopping): how we actually define/delimit the concept (or its dimensions)
- Variable (e.g., speed of traffic, distance to nearest store): how we actually measure the indicator of the concept/dimensions
- Attribute (e.g., fast vs. slow, actual yards): different values of the variable at a particular level of measurement
Levels of Measurement
Nominal – non-ranked categories (e.g., blood type, gender, yes/no)
Ordinal – ranked/ordered categories (e.g., pain, BMI category)
Interval – equal distance between values (e.g., temperature, IQ, perceived exertion)
Ratio – interval measure with defined zero (e.g., distance, age, number of fruits eaten)
Observed value = true value + error
Error is comprised of:
- Random error: more unpredictable (e.g., due to mood)
- Systematic error: predictable, occurs in one direction (e.g., poorly calibrated scale, biased survey question)
Our goal is to remove as much error (especially systematic error) as possible from the measurement scenario:
•Clear definitions of concept to be studied
•Careful planning and design (of protocol or instrument)
•Equipment maintenance and testing
•Clear and detailed protocol
•Training on administration
•Pretesting/pilot testing protocol or instrument
Latent Variables
- Not manifest; not directly observable
- Variable: strength or magnitude changes (across time, place, people, etc.)
- All latent variables have a true score (e.g., the true level of my self-esteem at the time of measurement)
- This true score of the latent variable is presumed to be the cause of responses to items designed to measure it (e.g., "I am able to do things as well as most other people")
- Thus, the true score and the item score are correlated,
- and therefore all individual items are correlated with each other (e.g., "I feel that I have a number of good qualities")
Random Measurement Error
- Lack of precision in measurement that results from factors that randomly affect the measurement of an attribute across the sample
- Pushes observed scores up or down randomly
- Sum of all random errors = 0
- Decreases reliability (i.e., precision) of measurement
- Increases variability in the data (often called noise)
- e.g., mood can inflate (good mood) or deflate (bad mood) the observed score relative to the true score
Systematic Measurement Error
- Error in measurement that results from factors that systematically increase or decrease the scores used to measure an attribute
- Consistently positive or negative, shifting the sample mean up or down
- Sum of all systematic errors does NOT equal 0
- Systematic error threatens validity (especially if it affects one or more subgroups in the sample)
- e.g., loud traffic outside a room where students are taking a test
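The contrast between the two error types can be sketched numerically (hypothetical values: everyone's true score is 70, random error has SD 5, and the systematic error is a constant +3):

```python
import numpy as np

rng = np.random.default_rng(1)
true_score = np.full(10000, 70.0)        # everyone shares the same true score

# Random error: unpredictable, mean ~ 0 -> adds variance, mean unchanged
random_err = rng.normal(0, 5, size=10000)
obs_random = true_score + random_err

# Systematic error: one-directional shift (e.g., a miscalibrated scale)
obs_system = true_score + 3.0

print(round(obs_random.mean(), 1))  # ~ 70.0: random errors sum to ~0
print(round(obs_random.var(), 1))   # ~ 25.0: added noise (variance)
print(obs_system.mean())            # 73.0: mean shifted, no added variance
```

This is the "random error increases variance, systematic error changes the mean" distinction in action.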
Internal Consistency Reliability
- How homogeneous are the items in a scale?
- Strong internal consistency comes from items in a scale being highly correlated (if I am 'high' on one item, I am also 'high' on the others)
- If items are strongly correlated, this is likely because they all share some common cause,
- not due to some other source of variation, because each item's error is independent of the other items' errors
- Sharing a common cause (the latent variable) is the definition of a set of scale items
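Cronbach's alpha, mentioned earlier, quantifies exactly this: items sharing a common cause covary, and alpha summarizes that shared variance. A minimal sketch on simulated data (the item count, sample size, and noise level are arbitrary assumptions):

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents, k_items) score matrix."""
    k = items.shape[1]
    item_var_sum = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

# Hypothetical 4-item scale: each item = latent variable + independent noise
rng = np.random.default_rng(2)
latent = rng.normal(0, 1, size=500)
items = np.column_stack([latent + rng.normal(0, 0.5, 500) for _ in range(4)])

print(round(cronbach_alpha(items), 2))  # high alpha: items share one cause
```

If the items were driven by noise alone rather than a common latent variable, the same function would return a value near zero.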
Variance can be decomposed into:
1. Variation due to true differences in the characteristic of interest (signal)
2. Error = variation due to factors other than the actual level of the characteristic (noise)
Covariance
- "Typical degree of joint departure from average," or the "average cross product"
- Determine the difference of each observation's score from the mean on two variables
- Multiply each pair together
- Divide by the number of observations
Correlation coefficient = standardized form
- Variances all equal to 1.0
- Eliminates issues of measurement units
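The covariance recipe above, worked through on two tiny hypothetical variables:

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0])
y = np.array([1.0, 3.0, 5.0, 7.0])

# Step 1: each observation's departure from the mean, on both variables
dx, dy = x - x.mean(), y - y.mean()
# Step 2: multiply each pair; Step 3: divide by the number of observations
cov = (dx * dy).sum() / len(x)      # the "average cross product"
# Standardized form: divide out the units -> correlation coefficient
r = cov / (x.std() * y.std())

print(cov, round(r, 4))  # 5.0 1.0  (perfectly linearly related)
```

Standardizing is what makes the result unit-free: the covariance of 5.0 depends on the scales of x and y, while r = 1.0 would be the same in any units.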
If two items are measuring the same latent variable, the variation that they share is the variation driven by the latent variable. If two items are closely related to the same latent variable, they will be closely related to each other as well. (e.g. if true value of latent variable is low, score for all items should be fairly low)
Concurrent Validity
- A form of criterion-related validity
- Correlating one measure with another measure taken at the same time that is thought to be a good indicator of the construct (e.g., a competitiveness scale and observed actions during an athletic event)
Predictive Validity
- A form of criterion-related validity
- Is a measure consistently related (positively or negatively) to some other (future or past) measure that we trust is a good predictor of the real concept of interest? (e.g., a political leaning scale and voting patterns)
Cognitive Interviewing Content Areas
- What is the question asking me?
- Which memories are relevant to this question?
- How do I use these memories to come up with an answer?
- How does this answer fit with the response options provided?
Approaches to Cognitive Interviewing
Think Aloud Method
Advantages:
- 1) Freedom from interviewer-imposed bias (nonreactive)
- 2) Minimal interviewer training requirements
- 3) Open-ended format can uncover unanticipated issues
Disadvantages:
- 1) Need for subject training: the task is unusual and requires significant training of subjects to elicit substantial responses
- 2) Limited subject think-aloud proficiency: people aren't good at thinking aloud
- 3) Burden on subject: the think-aloud activity places the main burden on the subject (compare to verbal probing, where the burden is on the interviewer)
- 4) Tendency for the subject to stray from the task: the subject controls the discussion, so if the participant wanders, it can take a long time to test a few questions
- 5) It's difficult to tell whether the participant can answer the question: the think-aloud task places very little emphasis on actually answering the question
- 6) Subjects often fail to supply useful think-alouds: often, subjects use direct retrieval and simply answer the question rather than providing elaborate think-aloud streams
Verbal Probing
The interviewer asks a question, the subject answers, the interviewer asks a probe question, the subject answers the probe, with possible further cycles until the interviewer asks the next target question. Probes are hypotheses about what might bias questions.
Advantages:
- 1) Maintains control of the interview, avoiding irrelevant discussion
- 2) Investigative focus: the interviewer can focus on areas that seem likely to cause potential problems
- 3) Easy to train the subject
Disadvantages:
- 1) Introduces reactivity: probes influence the subject's response
- 2) Potential for bias: probes can have the same sorts of problems as survey questions and may lead a participant to a particular type of response
- 3) Need for careful training: probing is complex, and the interviewer needs to be trained thoroughly
Cognitive Interview Steps
- STEP 1. READING: Is it difficult for interviewers to read the question in the same way to all respondents?
- STEP 2. INSTRUCTIONS: Look for problems with any introductions, instructions, or explanations that are conflicting, inaccurate, or complicated
- STEP 3. CLARITY: Identify problems related to communicating the intent or meaning of the question to the respondent
- STEP 4. ASSUMPTIONS: Determine whether there are problems with assumptions made or the underlying logic.
Types of error in survey research
- Non-observation: Are the right people being observed? Coverage, sampling, and nonresponse error.
- Observation: Are we getting the right answer? Interviewer error (questions read incorrectly, errors in recording data, etc.); response error (characteristics of questions, and of respondent processing of those questions, may lead to incorrect answers).
- Post-observation error: Are we doing the right things with the data? Processing error: data may be coded or analyzed incorrectly.
- Interpretation error: Are we drawing the correct conclusions?
- Random error: increases variance
- Systematic error: changes mean
Steps to survey design
- 1) Determine what you want to measure
- 2) Generate an item pool (may be existing items, or new items may be written)
- 3) Question format: scale options, response categories
- 4) Review instruments/expert review: questionnaire design, subject matter, questionnaire administration
- 5) Construct questionnaire/consider validation items: items to detect flaws or confirm construct validity
- 6) Pretest questionnaire
- 7) Administer scale to development sample
- 8) Evaluate scale, optimize length
Uses of Factor Analysis
- Determine how many factors underlie a set of items
- Define the meaning of constructs
- Identify items that are performing better or worse than others
When can you use factor analysis?
- Assumes interval level measurement, random sampling, linearity (not curvilinear), normal distribution.
- Requires at least 3 items per construct
Exploratory Factor Analysis
Conducted when you want to identify how many factors are responsible for a set of items and to identify constructs.
- At least three questions for each factor
- Items share conceptual meaning; variables loading on different factors measure different constructs
- Simple structure: high loading on the appropriate factor, near-zero loadings on others
- Eliminate items that cross-load, or items that load poorly on the construct they are supposed to measure
- Factor loadings help identify the factors responsible for covariation in the data
Exploratory Factor Analysis Steps
Step 1: Initial extraction of factors
- Select extraction method: principal components, common factor analysis, or maximum likelihood?
Step 2: Determine the number of factors to retain
- Eigenvalue: amount of variance captured by each factor; 1 = the average amount of variance of one item in the scale. Values near 1 may not be meaningful.
- Scree test: graph of eigenvalues; retain the number of factors above the elbow.
- Problem: the elbow isn't always obvious. Elbow = last substantial drop. Proportion of variance explained can also guide the choice.
Step 3: Rotate to a final solution
- Unrotated solution: interpretable only if there is a single factor
- Orthogonal rotation: interpretable for multiple factors; assumes factors are uncorrelated
- Oblique rotation: allows correlation between factors
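The eigenvalue rule in Step 2 can be sketched with plain numpy (hypothetical data: six items generated from two uncorrelated factors, with arbitrary noise level):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 300
f1, f2 = rng.normal(size=n), rng.normal(size=n)  # two latent factors

# Six items: three load on factor 1, three on factor 2 (plus noise)
items = np.column_stack(
    [f1 + rng.normal(0, 0.6, n) for _ in range(3)]
    + [f2 + rng.normal(0, 0.6, n) for _ in range(3)]
)

# Eigenvalues of the correlation matrix (sorted descending); with
# standardized items, 1.0 is the average variance of a single item
eigvals = np.linalg.eigvalsh(np.corrcoef(items, rowvar=False))[::-1]
n_retained = int((eigvals > 1).sum())  # eigenvalue-greater-than-1 rule

print(n_retained)  # 2
```

Plotting `eigvals` against factor number would give the scree graph: two large values, then a sharp drop to the near-zero remainder.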
Principal Component Analysis
- Principal component analysis is NOT exploratory factor analysis. Exploratory factor analysis seeks to identify hypothetical variables underlying a set of items.
- A principal component is an artificial variable: a weighted sum of the observed variables.
- In PCA, components account for total variance, not just common variance. Only factor analysis allows the researcher to identify the underlying factors and the variance they drive.
Exploratory factor analysis models common variance, while principal component analysis captures the maximum variance possible.
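The "weighted sum of the observed variables" point can be made concrete with a bare-bones PCA via eigen-decomposition (hypothetical three-variable data; no claim about any particular software's implementation):

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(200, 3))
X[:, 1] += X[:, 0]                 # give two variables shared variance
Xc = X - X.mean(axis=0)            # center the observed variables

# PCA = eigen-decomposition of the covariance matrix of the items
cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
weights = eigvecs[:, -1]                 # weights for the first component

# The component is literally a weighted sum of the observed variables,
# and its variance equals the largest eigenvalue (total, not common, variance)
pc1 = Xc @ weights
print(bool(np.isclose(pc1.var(ddof=1), eigvals[-1])))  # True
```

Because the weights maximize the variance of `pc1`, PCA soaks up total variance, including each item's unique error variance, which is why it is not a substitute for factor analysis when the goal is identifying latent constructs.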
Criteria for a good factor solution:
- Items load strongly on their primary factor (.40 or greater)
- Items do not cross-load (> .20) on other factors
- Each factor contains at least 3 items
- Items/factors make sense (share conceptual meaning)
What to do with underperforming individual items?
- Remove them from further consideration (if they don’t load on a logical factor or if they cross-load on multiple factors)
- Cognitive interviewing to better understand issues participants encounter in responding to the item(s) and how wording or response format can be improved
Scale Development Process