the assignment of numbers to represent the amount of an attribute present in an object or person, using specific rules
advantages of measurements
removes guess work
provides precise information
less vague than words
levels of measurement [4 levels/classes]
the lowest level; involves using numbers simply to categorize attributes, e.g. gender and blood type. The numbers DO NOT have quantitative meaning
ranks people based on relative standing on an attribute [putting things into categories but… rec 2:35], e.g. Braden scale. It DOES NOT tell us how much greater one level is than another
objects ordered on a scale that has equal distances between points on the scale
occur when researchers can rank people on an attribute and specify the distance between them, e.g. psychological tests, an IQ score. Measures CAN BE averaged, and many statistical procedures require interval data
equal distances between score units; there is a rational, meaningful zero
the highest level of measurement. Have a meaningful zero and thus provide information about the absolute magnitude of the attribute, e.g. weight, height, distance.
which is the highest level of measurement?
we can think of quantitative data as consisting of two parts. what are they?
is the observed score, an actual data value for a participant
e.g. a patient's heart rate or a score on an anxiety scale
is what we would get if we had an infallible measure – it's what we would obtain if everything were exact [environment etc.]
error of measurement
the error of measurement, caused by factors that distort measurement
some errors are random, while others are systematic representing a source of bias
obtained score equation
Obtained Score = True score ± Error
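A small worked sketch of the obtained score equation, using made-up numbers (the true score, the error values, and the bias of +5 are all hypothetical, chosen only for illustration). It suggests how random errors tend to cancel out across repeated measurements, while a systematic error shifts every score in the same direction.

```python
# Hypothetical values for illustration only.
true_score = 72.0                      # what an infallible measure would yield
random_errors = [3.0, -2.0, 1.0, -2.0]  # random distortions on four occasions
systematic_bias = 5.0                  # e.g. a miscalibrated instrument

# Obtained score = true score +/- error
obtained = [true_score + e for e in random_errors]
mean_obtained = sum(obtained) / len(obtained)

# With a systematic error added, every obtained score is shifted the same way.
obtained_biased = [true_score + e + systematic_bias for e in random_errors]
mean_biased = sum(obtained_biased) / len(obtained_biased)

print(obtained)        # individual scores scatter around the true score
print(mean_obtained)   # random errors in this example sum to zero
print(mean_biased)     # the bias does not average out
```

In this example the random errors were picked to sum to zero, so the mean of the obtained scores equals the true score; real random errors only cancel approximately.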
factors contributing to measurement errors
transitory personal factors [e.g., fatigue]
scores can be affected by the conditions under which they are produced
i.e. environmental factors
transitory personal factors
temporary states such as fatigue, hunger, or mood can influence people's motivation or ability to cooperate, act naturally, or do their best
enduring characteristics of respondents can interfere with accurate measures
the ways in which an instrument is realized or put into use
errors can reflect the sampling of items used to measure an attribute
is an evaluation of the quality of a measuring instrument
key criteria in a psychometric assessment
validity [of an instrument]
the consistency and accuracy with which an instrument measures the target attribute
reliability assessments involve computing a _?_
reliability coefficients can range from ...?
from .00 to 1.00
coefficients below .70 are considered unsatisfactory
coefficients of .80 or higher are desirable
what are the three aspects of reliability of interest to quantitative researchers
the extent/degree to which scores are similar on two separate administrations of an instrument
stability is evaluated [assessed] by what?
what does test-retest reliability require
requires participants to complete the same instrument on two occasions
appropriate for relatively enduring attributes [e.g., creativity]
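A minimal sketch of how a test-retest reliability coefficient might be computed. It assumes the coefficient is Pearson's r between the two administrations (the notes do not name the statistic), and the scores below are invented.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical scores for five participants on two occasions.
time1 = [10, 12, 15, 18, 20]
time2 = [11, 12, 14, 19, 21]

r = pearson_r(time1, time2)
print(round(r, 2))  # a value near 1.00 suggests stable scores
```

A coefficient close to 1.00 would indicate that participants kept roughly the same relative standing across the two administrations.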
the extent to which all the items on an instrument are measuring the same unitary attribute [can measure the same trait]
appropriate for most multi-item instruments
the most widely used approach to assessing reliability [in nursing research]
assessed by computing coefficient alpha [Cronbach's alpha]. Alphas ≥.80 are highly desirable. The higher the coefficient, the more accurate [internally consistent] the measure
internal consistency is evaluated by what?
by administering the instrument on one occasion
the degree of similarity between alternative forms of an instrument or between multiple raters/observers using an instrument
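A sketch of coefficient alpha computed from a single administration, using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name and the response data are hypothetical.

```python
from statistics import variance

def cronbach_alpha(rows):
    """rows: one list of item scores per respondent (same items for everyone)."""
    k = len(rows[0])                          # number of items on the scale
    items = list(zip(*rows))                  # regroup the scores item by item
    item_var_sum = sum(variance(item) for item in items)
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

# Hypothetical responses: 5 respondents, 3 items, each scored 1-5.
responses = [
    [4, 5, 4],
    [2, 3, 3],
    [5, 5, 4],
    [1, 2, 2],
    [3, 3, 4],
]

alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

When respondents who score high on one item also tend to score high on the others, the variance of the total scores is large relative to the summed item variances, and alpha moves toward 1.00.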
most relevant for structured observations
primarily concerns the degree to which two or more independent observers or coders agree about scoring on an instrument
equivalence is assessed by?
assessed by comparing agreement between observations or ratings of two or more observers [interobserver/interrater reliability]
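A sketch of two common ways to quantify agreement between two raters: the simple proportion of agreement, and Cohen's kappa, which adjusts for agreement expected by chance. Kappa is an assumption here, since the notes do not name a specific index, and the ratings are invented.

```python
from collections import Counter

def proportion_agreement(r1, r2):
    """Share of observations on which the two raters gave the same code."""
    return sum(a == b for a, b in zip(r1, r2)) / len(r1)

def cohens_kappa(r1, r2):
    """Agreement corrected for chance: (p_o - p_e) / (1 - p_e)."""
    n = len(r1)
    p_o = proportion_agreement(r1, r2)
    c1, c2 = Counter(r1), Counter(r2)
    # Chance agreement: product of each rater's marginal proportions per category.
    p_e = sum(c1[c] * c2[c] for c in set(r1) | set(r2)) / n ** 2
    return (p_o - p_e) / (1 - p_e)

# Hypothetical codes from two independent observers for ten observations.
rater_a = ["yes", "yes", "no", "no", "yes", "no", "yes", "no", "yes", "yes"]
rater_b = ["yes", "no", "no", "no", "yes", "no", "yes", "yes", "yes", "yes"]

agreement = proportion_agreement(rater_a, rater_b)
kappa = cohens_kappa(rater_a, rater_b)
print(agreement, round(kappa, 2))
```

Kappa is lower than the raw proportion because some of the observed agreement would be expected even if both raters coded at random.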
low reliability can undermine adequate testing of hypotheses
reliability estimates vary depending on procedure used to obtain them
reliability is lower in homogeneous than heterogeneous samples
reliability is lower in shorter than longer multi-item scales.
the degree to which an instrument measures what it is supposed to measure
remember that reliability and validity are NOT independent qualities of an instrument. however an instrument CAN be reliable without being valid
four aspects of validity
an instrument's _?_ reliability does not provide evidence for its _?_, but _?_ _?_ of a measure is evidence of _?_ _?_
an instrument's _high_ reliability does not provide evidence for its _validity_, but _low_ _reliability_ of a measure is evidence of _low_ _validity_
refers to whether the instrument looks as though it is an appropriate measure of the construct
based on judgement; no objective criteria for assessment
the degree to which an instrument has an adequate sample of items for the construct being measured
An instrument's content validity is based on judgment. There are no totally objective methods for ensuring adequate content coverage
content validity is evaluated by what?
by expert evaluation, often via a quantitative measure -> the content validity index [CVI]
content validity index [CVI]
indicates the extent of expert agreement
value of 0.90 or higher is the standard for establishing excellence in a scale's content validity
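A sketch of one common way a CVI is computed. It assumes experts rate each item on a 4-point relevance scale, that ratings of 3 or 4 count as "relevant", and that the scale-level CVI is the average of the item-level values. These conventions and the ratings are assumptions, not stated in the notes.

```python
def item_cvi(ratings):
    """Proportion of experts rating the item 3 or 4 (assumed 4-point relevance scale)."""
    return sum(r >= 3 for r in ratings) / len(ratings)

def scale_cvi_average(items):
    """Scale-level CVI taken as the mean of the item-level CVIs (averaging approach)."""
    return sum(item_cvi(ratings) for ratings in items) / len(items)

# Hypothetical ratings: 4 items, each rated by 5 experts.
expert_ratings = [
    [4, 4, 3, 4, 4],
    [3, 4, 4, 4, 2],
    [4, 3, 4, 3, 4],
    [4, 4, 4, 3, 3],
]

item_values = [item_cvi(ratings) for ratings in expert_ratings]
s_cvi = scale_cvi_average(expert_ratings)
print(item_values, round(s_cvi, 2))
```

In this example only the second item falls short of full agreement, and the scale-level value would still meet the 0.90 standard.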
the degree to which the instrument is related to an external criterion
It assists decision-makers by giving them some assurance that their decisions will be fair, appropriate, and, in short, valid.
validity coefficient is calculated by what?
by analyzing the relationship between scores on the instrument and criterion
two types of criterion-related validity
the instrument's ability to distinguish people whose performance differs on a future criterion
the instrument's ability to distinguish individuals who differ on a present criterion
what is a desirable coefficient value for criterion-related validity
0.70 or higher
construct validity is concerned with what types of questions?
what is this instrument really measuring?
does it adequately measure the construct of interest?
methods for assessing construct validity
testing relationships based on theoretical predictions
construct validity is a key criterion for assessing what?
assessing research quality and is most often linked to measurement. it is essentially a hypothesis-testing endeavor, typically linked to theoretical conceptualizations
groups that are expected to differ on the target attribute are administered the instrument, and group scores are compared
testing relationships based on theoretical predictions
can be fallible but offers supporting evidence
a method for identifying clusters of related items for a scale. Identifies and groups together different measures into a unitary scale based on how participants reacted to the items, rather than on the researcher's preconceptions
what needs to be evaluated for screening and diagnostic instruments?
the instrument’s ability to correctly identify noncases, that is, to screen out those without the condition [yielding true negatives]
the instrument’s ability to correctly identify a “case” [yielding true positives], i.e., to diagnose a condition
Summarizes the relationship between sensitivity and specificity in a single number
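LR+: the ratio of the true-positive rate to the false-positive rate, i.e. sensitivity / (1 − specificity)
LR-: the ratio of the false-negative rate to the true-negative rate, i.e. (1 − sensitivity) / specificity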
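A sketch of how sensitivity, specificity, and the two likelihood ratios could be computed from a 2x2 table of screening results. The likelihood ratios are expressed in terms of rates, which is the usual formulation: LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The function name and the counts are hypothetical.

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Summarize a screening instrument from true/false positive and negative counts."""
    sensitivity = tp / (tp + fn)   # correctly identified cases
    specificity = tn / (tn + fp)   # correctly identified noncases
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
    }

# Hypothetical results: 50 people with the condition, 100 without.
summary = diagnostic_summary(tp=40, fp=20, fn=10, tn=80)
for name, value in summary.items():
    print(name, round(value, 2))
```

With these counts, a positive result is about four times as likely in someone with the condition as in someone without it, and a negative result is about a quarter as likely.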