The flashcards below were created by user
on FreezingBlue Flashcards.
what is assessment?
it involves some type of measurement,
it involves gathering samples of behavior, making inferences
it is objective & systematic
definition of assessment?
formal and informal, used to quantify a behavior
definition of appraisal?
formal and informal
what is the purpose of tests?
what is the purpose of an instrument?
to provide greater knowledge to context
what are the essential steps in counseling?
- 1. assessing the client's problems
- 2. conceptualizing and defining the client's problems
- 3. selecting and implementing effective treatments
- 4. evaluating the counseling
what is assessing the client's problems?
- -collaborative relationship
- -confirmatory bias
what is confirmatory bias?
a tendency to search for or interpret information in a way that confirms one's own beliefs
what is conceptualizing the client's problems?
what is selecting and implementing effective treatments?
educating clients about purpose of assessment
what is evaluating the counseling?
impacts legislation, public perception, professional identity
assessment can be therapeutic in that . . .
it can be used to further build rapport with clients
therapeutic assessment model
assisting clients in decision making
what is therapeutic assessment model?
better outcomes, improved perception of counselor
shift away from traditional thinking of role of assessment
what are types of assessment tools?
- standardized vs non-standardized
- individual vs group
- objective vs subjective
- verbal vs nonverbal
- speed vs power
- cognitive vs affective
what is individual vs group?
quality vs quantity
what is objective vs subjective?
manualized administration vs bias of researcher
what is verbal vs nonverbal?
what is speed vs power?
# of items completed vs difficulty of items completed
what are cognitive instruments?
cognition, perceiving, processing, concrete & abstract thinking, remembering
- -intelligence/general ability tests
- - achievement tests
- -aptitude tests
what are affective instruments?
interest, attitudes, values, motives, temperament, non-cognitive aspects of personality
- -structured personality instruments
- -projective techniques
when was early testing done?
- greeks - 2500 years ago
- chinese - 2000 years ago
who was credited with launching the testing movement - emphasis on sensory perception and intelligence?
who was credited with founding the science of psychology - furthered intelligence testing?
who expanded testing to include memory and other simple mental processes?
james mckeen cattell
what is binet-simon scale (1905)?
- -assessed judgement, comprehension & reasoning
- -ratio of mental age to chronological age (IQ)
what edition is the stanford-binet scale on?
world war 1 - group testing (army alpha & army beta)?
used to assign work role/tasks
frank parsons, "father of guidance"?
promoted more systemic views of assessment and career
theoretical basis became a debate concerning the definition of?
interest in assessment spread beyond intelligence - led to development of self-report personality inventories such as?
rorschach inkblots developed in 1921
aptitude tests developed for ?
selecting and classifying industrial personnel
stanford achievement test (1923) was ?
first standardized achievement battery
the first edition of mental measurements yearbook was when?
dissatisfaction with existing personality instruments led to :
- projective techniques became popular
- MMPI developed (early 1940s)
- prevented individuals from "faking" results
standardized achievement tests well-established in public schools?
multiple aptitude batteries appeared after 1940
criticisms of assessment began to emerge:
- need for standards (APA)
- need for centralized test publication
- condensing of testing agencies/publishers allowed for electronic scoring
- increased accessibility of tests
examination and evaluation of testing and assessment - widespread public concern:
- misuse of testing instruments
- cultural biases
family educational rights and privacy act (1974):
- mandated right to view educational records, including testing
- required permission of parents for many types of testing
use of computers blossomed: administration, scoring, interpretation, computer-adaptive testing, report-writing
customized testing based on test-taker
revision of instruments in response to criticism
increase in cultural diversity and awareness of test bias
increasing use of authentic and portfolio assessment
educational settings - evaluation through multiple tests, instruments, assessments
in 1900's what happened?
- binet-simon scale
- stanford-binet scale
- world war 1 - group testing
- frank parsons "father of guidance"
in 1920's and 1930's what happened?
- theoretical basis became a debate concerning the definition of intelligence
- interest in assessment spread beyond intelligence - led to development of self-report personality inventories
- aptitude tests developed for selecting and classifying industrial personnel
- development of vocational counseling instruments
- stanford achievement test
- first edition of mental measurements yearbook
what happened in 1940s and 1950s?
- dissatisfaction with existing personality instruments
- standardized achievement tests well-established in public schools
- criticisms of assessment began to emerge
what happened in 1960s and 1970s?
- examination and evaluation of testing and assessment - widespread public concern
- 1970's grassroots movement for "minimum competency" testing for high school graduates
- family educational rights and privacy act
- increased use of computers in assessment
what happened in 1980's and 1990s?
- use of computers blossomed: administration, scoring, interpretation, computer-adaptive testing, report-writing
- revision of instruments in response to criticism
- increasing use of authentic and portfolio assessment
what happened in 2000s to the present?
- influences of technology and the internet
- research on multicultural issues
- achievement testing & no child left behind
- increased interest in accountability and effectiveness data
- revision of standards and DSM
what is nominal?
measurement scale in which numbers are assigned to attributes of objects or classes of objects solely for the purpose of identifying the objects
what is ordinal?
a scale on which data is shown simply in order of magnitude since there is no standard of measurement of differences
what is interval?
scale of measurement for which the difference between two points on the scale is meaningful.
what is ratio ?
The ratio scale of measurement is similar to the interval scale in that it represents quantity and has equality of units. One difference is that this scale also has an absolute zero, with no numbers existing below the zero.
norm-referenced instruments -
- individual's score is compared to performance of others who have taken the same instrument (norming group)
- example : personality inventory
- evaluating the norming group by size , sampling, and representation
criterion referenced instruments -
- individual's performance is compared to specific criterion or standard
- example : third-grade spelling tests
- how are standards determined? common practice, professional organizations or experts, empirically-determined
what is frequency distribution?
mathematical function showing the number of instances in which a variable takes each of its possible values
what is frequency polygon?
A graph made by joining the middle-top points of the columns of a frequency histogram
what is histogram?
diagram consisting of rectangles whose area is proportional to the frequency of a variable and whose width is equal to the class interval
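A frequency distribution and a rough text histogram can be sketched in a few lines. The scores below are made up for illustration, and each distinct score is treated as its own class interval, which is an assumption of this sketch.

```python
from collections import Counter

# Hypothetical raw scores (illustrative only)
scores = [70, 85, 85, 90, 70, 85, 100]

# Frequency distribution: each distinct score mapped to how often it occurs
frequency = Counter(scores)

# Text histogram: one '#' per occurrence, scores in ascending order
for score in sorted(frequency):
    print(f"{score:>3} | {'#' * frequency[score]}")
```

Joining the tops of the bars with straight lines would give the corresponding frequency polygon.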
what is mode?
most frequent score
what is median?
evenly divides scores into two halves (50% of scores fall above, 50% fall below)
what is mean?
arithmetic average of the scores
what are measures of variability?
range, variance, standard deviation
what are measures of central tendency?
mean, median, mode
what is range?
highest score minus lowest score
what is variance?
average of the squared deviations from the mean
what is standard deviation?
square root of variance
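The measures of central tendency and variability above can be checked with Python's standard `statistics` module. This is a minimal sketch on made-up scores; it assumes the population form of the variance (squared deviations divided by the number of scores), whereas some textbooks divide by one less than the number of scores.

```python
import statistics

# Hypothetical raw scores (illustrative only)
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Measures of central tendency
mean = statistics.mean(scores)      # arithmetic average
median = statistics.median(scores)  # middle of the ordered scores
mode = statistics.mode(scores)      # most frequent score

# Measures of variability
score_range = max(scores) - min(scores)  # highest minus lowest
variance = statistics.pvariance(scores)  # mean squared deviation from the mean
std_dev = statistics.pstdev(scores)      # square root of the variance

print(mean, median, mode, score_range, variance, std_dev)
```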
what is normal distribution?
a function that represents the distribution of many random variables as a symmetrical bell-shaped graph
what is skewed distribution?
a normal distribution is a bell-shaped curve with the most scores in the middle, tapering off toward the higher and lower scores. a skewed distribution is one in which most scores fall toward either the high or the low end instead of the middle
what are types of scores?
- raw scores
- percentile scores/percentile ranks
- standard scores
what are standard scores?
z scores, t scores, stanines, age/grade-equivalent scores
how can we interpret percentiles?
- 98th percentile - 98% of the group had a score at or below this individual's score
- 32nd percentile - 32% of the group had a score at or below this individual's score
- if there were 100 people taking the assessment, 32 of them would have a score at or below this individual's score
- units are not equal
- useful for providing information about relative position in normative sample
- not useful for indicating amount of difference between scores
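The "at or below" reading of a percentile can be written as a short function. This is a sketch under that definition only; the function name and the norming scores are hypothetical, and published instruments may use slightly different conventions for ties.

```python
def percentile_rank(score, norm_scores):
    """Percentage of the norming group scoring at or below `score`."""
    at_or_below = sum(1 for s in norm_scores if s <= score)
    return 100 * at_or_below / len(norm_scores)

# Hypothetical norming group of 100 people with scores 1..100
norm_group = list(range(1, 101))

print(percentile_rank(32, norm_group))
print(percentile_rank(98, norm_group))
```

Because the units are not equal, the gap between the 32nd and 42nd percentile need not reflect the same raw-score difference as the gap between the 88th and 98th.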
what is z score?
how many standard deviations from the mean a value is
what is t score?
scores scaled to have a mean of 50 and a standard deviation of 10
what are stanines?
method of scaling test scores on a nine-point standard scale with a mean of five and a standard deviation of two
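The three standard scores can be derived from a raw score once the mean and standard deviation of the norming group are known. The sketch below uses hypothetical function names and values. The stanine conversion (2z + 5, rounded and limited to 1 through 9) is a common approximation assumed here; test publishers often assign stanines from percentile bands instead.

```python
import math

def z_score(raw, mean, sd):
    # Number of standard deviations the raw score lies from the mean
    return (raw - mean) / sd

def t_score(raw, mean, sd):
    # Rescaled so the mean is 50 and the standard deviation is 10
    return 50 + 10 * z_score(raw, mean, sd)

def stanine(raw, mean, sd):
    # Approximation: mean of 5, standard deviation of 2, limited to 1..9
    value = math.floor(2 * z_score(raw, mean, sd) + 5.5)
    return max(1, min(9, value))

# Hypothetical instrument with mean 100 and standard deviation 15
print(z_score(115, 100, 15), t_score(115, 100, 15), stanine(115, 100, 15))
```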
what are problematic scores?
- age-equivalent scores
- grade-equivalent scores
- problematic because they do not reflect precise performance on an instrument
- learning does not always occur in equal developmental levels
- instruments vary in scoring
adequacy of norming group depends on:
- clients being assessed
- purpose of the assessment
- how information will be used
- examine methods used for selecting group
- examine characteristics of norming group
methods for selecting norming group:
- simple random sample
- stratified sample
- cluster sample
what is stratified sample?
method of sampling in which the population is divided into subgroups (strata) and a sample is drawn from each subgroup
what is cluster sample?
sampling technique in which existing groups (clusters) are randomly selected and the members of the selected clusters are sampled
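The three selection methods differ only in what gets drawn at random. The sketch below uses a made-up population of 12 students; the field names (`grade`, `school`) and function names are hypothetical, and the cluster version keeps every member of each chosen cluster, which is one of several possible designs.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 12 students tagged with a grade and a school
population = [
    {"id": i, "grade": 3 + i % 3, "school": "ABCD"[i % 4]}
    for i in range(12)
]

def simple_random_sample(people, n):
    # Every individual has an equal chance of selection
    return rng.sample(people, n)

def stratified_sample(people, key, per_stratum):
    # Split into strata first, then draw randomly within each stratum
    strata = {}
    for person in people:
        strata.setdefault(person[key], []).append(person)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, per_stratum))
    return sample

def cluster_sample(people, key, n_clusters):
    # Randomly choose whole clusters, then keep everyone in them
    clusters = sorted({person[key] for person in people})
    chosen = set(rng.sample(clusters, n_clusters))
    return [person for person in people if person[key] in chosen]

print(len(stratified_sample(population, "grade", 2)))
print(len(cluster_sample(population, "school", 2)))
```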
norming group characteristics -
- educational background
- socioeconomic status
assessment report -
goal is to gain experience giving a formalized assessment in a safe situation and then using the results to decide treatment interventions and therapeutic goals - less emphasis on the construction of the instrument
instrument review -
goal is to understand more about how a particular instrument is created - by whom, how, psychometrics, the intended use, and your best clinical opinions of the quality of the instrument based on what you know thus far
classical test theory
every observed score is a combination of true score/ability and error
- reliability coefficient (ex 0.80) - proportion of observed variance that is true variance rather than error variance
- systematic versus unsystematic error
- reliability only takes unsystematic error into account
often based on consistency between two sets of scores
- statistical technique used to examine consistency (relationship between scores)
- correlation coefficient : ranges from -1.00 to +1.00 (.00 indicates lack of a relationship)
what is positive correlation?
a relationship between two variables such that their values increase or decrease together
what is negative correlation?
a relationship between two variables in which one variable increases as the other decreases
pearson product-moment correlation coefficient -
- most common (most practical sense, +, -, large, small)
- based on the products of paired z scores across administrations, divided by the number of individuals
coefficient of determination -
the percentage of shared variance between two sets of data
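One way to see the link between z scores, the correlation coefficient, and shared variance is to compute them directly. This sketch assumes the z-score form of the Pearson formula (mean of the products of paired z scores, using population standard deviations) and takes the coefficient of determination as the square of r. The scores and names are made up.

```python
import statistics

def pearson_r(x, y):
    """Mean of the products of paired z scores."""
    n = len(x)
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sd_x, sd_y = statistics.pstdev(x), statistics.pstdev(y)
    products = (
        ((a - mean_x) / sd_x) * ((b - mean_y) / sd_y)
        for a, b in zip(x, y)
    )
    return sum(products) / n

# Hypothetical scores from a first and second administration
first = [10, 12, 14, 16, 18]
second = [11, 13, 13, 17, 18]

r = pearson_r(first, second)
r_squared = r ** 2  # coefficient of determination: proportion of shared variance

print(round(r, 2), round(r_squared, 2))
```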
types of reliability -
- test/retest
- alternate/parallel forms
- internal consistency measures
what is test/retest?
variation between first and second administration
what is alternate/parallel forms?
two versions of test, can be administered closer together, eliminates memorized responses
what are internal consistency measures?
- split half - spearman-brown formula
- kuder-richardson formulas
- coefficient alpha (cronbach's alpha) for non-dichotomous test items
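Two of the internal consistency measures can be sketched briefly. The Spearman-Brown function below is the standard correction that projects a half-test correlation to the full-length test, and coefficient alpha is computed from item variances and total-score variance. The response data and function names are hypothetical.

```python
import statistics

def spearman_brown(half_test_r):
    # Estimated full-test reliability from the correlation between two halves
    return 2 * half_test_r / (1 + half_test_r)

def cronbach_alpha(responses):
    """responses: one row per person, one column per item."""
    k = len(responses[0])
    item_variances = [
        statistics.pvariance([row[i] for row in responses]) for i in range(k)
    ]
    total_variance = statistics.pvariance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical ratings: 5 people answering 4 items on a 1-5 scale
ratings = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
]

print(round(spearman_brown(0.70), 2))
print(round(cronbach_alpha(ratings), 2))
```

When every item is scored 0 or 1, the same alpha computation corresponds to the Kuder-Richardson 20 formula.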
nontypical situations -
typical methods for determining reliability may not be suitable for :
- speed tests
- criterion-referenced tests
- subjectively-scored instruments (inter-rater reliability)
evaluating reliability coefficients -
- examine purpose for using instrument
- be knowledgeable about reliability coefficients of other instruments in that area
- examine characteristics of particular clients against reliability coefficients
- number of times of reliability testing and type of reliability measures
standard error of measurement -
provides estimate of range of scores if someone were to take instrument repeatedly (attempt to indicate an individual's 'true score')
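The usual formula for the standard error of measurement is the instrument's standard deviation times the square root of one minus the reliability coefficient. The sketch below assumes that formula; the function names and the example values (standard deviation 15, reliability 0.91) are hypothetical.

```python
import math

def standard_error_of_measurement(sd, reliability):
    # SEM = SD * sqrt(1 - reliability coefficient)
    return sd * math.sqrt(1.0 - reliability)

def score_band(observed, sd, reliability, n_sem=1):
    # Range in which the true score is likely to fall (about 68% for 1 SEM)
    sem = standard_error_of_measurement(sd, reliability)
    return observed - n_sem * sem, observed + n_sem * sem

low, high = score_band(100, 15, 0.91)
print(round(low, 1), round(high, 1))
```

A higher reliability coefficient shrinks the band, which is why the same observed score can be interpreted more precisely on a more reliable instrument.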
validity -
- concerns what instrument measures and how well it does so
- not something instrument "has or does not have"
- informs counselor when it is appropriate to use instrument and what can be inferred from results
- reliability is a prerequisite for validity
traditional categories of validity -
- content related - items represent intended behavior
- criterion-related - how instrument relates to an outcome (SAT)
- construct - extent to which instrument measures theoretical or hypothetical construct
evidence based on test content -
- degree to which the evidence indicates that items, questions, or tasks adequately represent intended behavior domain
- central focus is typically on how the instrument's content was determined
- content-related validation evidence should not be confused with face validity (instrument appears to be a good measure)
evidence based on response processes -
- concerns whether individuals respond in a manner consistent with construct measured
- how do individuals think aloud
- may also examine information processing differences by subgroup
evidence based on internal structure -
- measuring one construct or the use of scales and subscales?
- examining internal structure using factor analysis (comparison of subscale sets)
- can also examine internal structure of instrument for different subgroups (differential item functioning) - (example: women consistently get a particular item correct while men consistently get it incorrect)
evidence based on relations to other variables -
- correlational method
- convergent evidence
- related to other variable which it should in theory be related to
- discriminant evidence
- not related to other variables from which it should differ
prediction/instrument-criterion relationship -
- concurrent validity - assesses current context, no gap in time - useful for diagnosing in session
- predictive validity - gap in time between administration and collecting criterion evidence
- decision theory
regression -
- cousin of correlation
- used to determine usefulness of variable or a set of variables in relation to other meaningful variable we are trying to predict
- regression line - instrument scores vs criterion - line of best fit
standard error of estimate -
- no perfect reliability or content-related validity
- the margin of expected error in the individual's predicted criterion score as a result of imperfect validity
- used to give clinicians a range for practical significance by accounting for error
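A regression prediction and its standard error of estimate can be sketched from summary statistics alone. This assumes the standard formulas: slope equal to r times the ratio of the criterion and instrument standard deviations, and standard error of estimate equal to the criterion standard deviation times the square root of one minus r squared. All names and numbers below are hypothetical.

```python
import math

def predict_criterion(score, mean_x, sd_x, mean_y, sd_y, r):
    # Line of best fit through the means, with slope r * (sd_y / sd_x)
    slope = r * sd_y / sd_x
    intercept = mean_y - slope * mean_x
    return intercept + slope * score

def standard_error_of_estimate(sd_y, r):
    # Expected margin of error around a predicted criterion score
    return sd_y * math.sqrt(1.0 - r ** 2)

# Hypothetical: instrument (mean 50, SD 10) predicting GPA (mean 3.0, SD 0.5)
predicted = predict_criterion(60, 50, 10, 3.0, 0.5, 0.6)
error = standard_error_of_estimate(0.5, 0.6)

print(round(predicted, 2), round(error, 2))
```

Reporting the prediction as a range (predicted score plus or minus the standard error of estimate) keeps the imperfect validity visible to the person reading the result.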
decision theory -
- "group separation" or "expectancy tables"
- do the scores of the instrument correctly differentiate into performance or diagnostic realms
- gathering of how often the instrument is right, and how often it is incorrect
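Counting how often an instrument is right or wrong amounts to cross-tabulating predictions against actual outcomes. The sketch below assumes a single cutoff score decides the prediction; the category labels, scores, outcomes, and cutoff are all made up for illustration.

```python
def classification_table(scores, outcomes, cutoff):
    """Cross-tabulate predictions (score >= cutoff) against actual outcomes."""
    table = {
        "hit_positive": 0,    # predicted yes, actually yes
        "false_positive": 0,  # predicted yes, actually no
        "false_negative": 0,  # predicted no, actually yes
        "hit_negative": 0,    # predicted no, actually no
    }
    for score, outcome in zip(scores, outcomes):
        predicted = score >= cutoff
        if predicted and outcome:
            table["hit_positive"] += 1
        elif predicted and not outcome:
            table["false_positive"] += 1
        elif not predicted and outcome:
            table["false_negative"] += 1
        else:
            table["hit_negative"] += 1
    return table

def hit_rate(table):
    # Proportion of all decisions that were correct
    correct = table["hit_positive"] + table["hit_negative"]
    return correct / sum(table.values())

# Hypothetical screening scores and whether each person met the criterion
scores = [12, 18, 25, 30, 8, 22, 15, 27]
outcomes = [False, True, True, True, False, False, False, True]

table = classification_table(scores, outcomes, cutoff=20)
print(table, hit_rate(table))
```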
validity generalization -
- method of combining validation studies to determine if validity evidence can be generalized to new situations
- must have a substantial number of studies
- must use meta-analytic procedures (exploration of many studies and different research factors)
evidence based on consequences of testing -
- examples of social consequences :
- group differences on tests used for employment selection
- group differences in placement in special education
- desire to see context and system in which instrument was administered
- counselors should consider both validation evidence and social implications
conclusions on different types of validation evidence :
- gradual accumulation of evidence
- no clear-cut decision that a construct is or is not present
- consult validity prior to interpreting results
- counselor must evaluate the information to determine if it is appropriate for which client under what circumstance
- validation evidence should also be considered in informal assessments
item analysis -
- examining and evaluating each item in instrument (validity refers to instrument as a whole)
- useful in development and revision process
- item difficulty (must consider other variables)
- item discrimination - does the item differentiate among responders on behavior domain
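Item difficulty and item discrimination are often computed as simple proportions. The sketch below assumes two common conventions that the cards do not spell out: difficulty as the proportion answering correctly, and discrimination as the difference in that proportion between the highest- and lowest-scoring groups (27% each here). The data are hypothetical.

```python
def item_difficulty(item_responses):
    # Proportion of test-takers answering the item correctly (1 = correct)
    return sum(item_responses) / len(item_responses)

def item_discrimination(item_responses, total_scores, fraction=0.27):
    # Proportion correct in the top group minus proportion correct in the bottom group
    n = len(total_scores)
    group_size = max(1, round(n * fraction))
    order = sorted(range(n), key=lambda i: total_scores[i])
    lower, upper = order[:group_size], order[-group_size:]
    p_upper = sum(item_responses[i] for i in upper) / group_size
    p_lower = sum(item_responses[i] for i in lower) / group_size
    return p_upper - p_lower

# Hypothetical: one item's responses and total test scores for 10 people
item = [1, 1, 1, 0, 0, 1, 0, 0, 1, 1]
totals = [9, 8, 7, 3, 2, 8, 1, 2, 6, 7]

print(item_difficulty(item), item_discrimination(item, totals))
```

A value near zero or below zero on the discrimination index would suggest the item does not separate high and low scorers on the behavior domain.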
item response theory
- focus is on each item and establishing items that measure ability or level of a latent trait
- emphasis is on each individual item rather than the sum of the whole
- particular interest in which items the person answers correctly or affirmatively
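The cards do not name a specific model, so the following is only an illustration using the one-parameter logistic (Rasch) model, one of the simplest item response models. It assumes the probability of a correct or affirmative answer depends on the gap between the person's trait level and the item's difficulty. The function and parameter names are hypothetical.

```python
import math

def probability_correct(ability, difficulty):
    # One-parameter logistic model: depends only on (ability - difficulty)
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

# Hypothetical item difficulties on the same scale as the latent trait
item_difficulties = {"easy item": -1.0, "medium item": 0.0, "hard item": 1.5}

for name, difficulty in item_difficulties.items():
    p = probability_correct(0.0, difficulty)
    print(f"{name}: {p:.2f}")
```

Under this model a person whose trait level equals the item difficulty has an even chance of answering correctly, which is what lets each item be placed on the same scale as the trait.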
selection of assessment instruments/strategies
- determine what information is needed
- analyze strategies for obtaining information
- search assessment resources
- evaluate assessment strategies
- select an assessment instrument or strategy
determine information needed -
- identify - information needed for specific client, general information clinicians in an organization need about clients
- consider information already available
analyze strategies for obtaining information -
- formal or informal techniques
- consider which assessment method would be best suited to clients
- consider professional limitations and which instruments counselor can ethically administer and interpret
search assessment resources
- mental measurements yearbook
- educational testing service
- tests in print
- tests: a comprehensive reference for assessments in psychology, education and business
- directory of unpublished experimental mental measures
evaluate assessment strategies
- test purpose
- instrument development
- appropriate selection of norming group or criterion
evaluate assessment strategies
- interpretation of scoring materials
- user qualifications (level a, b, c)
- practical issues
administering assessment instruments
- read administration materials ahead of time
- attend to time limits
- know boundaries of what is acceptable
- use administration checklist, if helpful
- hand, computer, or internet scored (some can be self-scored)
- before using computer scoring, investigate integrity of scoring service and steps used to develop program
- some assessments require clinician judgement as part of scoring
scoring authentic / performance assessments
- involve performance of "real" authentic applications, rather than proxies
- objectivity in scoring is more difficult to achieve
- scoring is enhanced if : assessment has specific focus, scoring plan is based on qualities that can be directly observed, scoring is designed to reflect intended target, the setting for assessment is appropriate, observers use checklists or rating scales, scoring procedures have been field tested before use
- often one of the most important parts of assessment process
- clients who receive test interpretation have greater gains than those who don't
- tentative interpretations more helpful than absolute
- clients prefer individual interpretation
guidelines for communicating results -
- know information in manual (especially validity information)
- optimize the power of the test
- use effective general counseling skills
- develop multiple methods of explaining results
- use visual aids to explain technical terms
- use descriptive terms rather than numerical
- provide range of scores and rationale for assessment
- use probabilities rather than certainties, tentative interpretations rather than absolutes
- discuss results in context of other information
- involve clients in interpretation
- monitor client reactions during interpretation
- encourage client to ask questions
- discuss limitations
- do not leave clients confused
communicating results to parents -
- be prepared to answer questions and explain results
- should understand testing used and symptoms of child's disorder
- help parents adjust to diagnosis
- be prepared to use variety of techniques
- focus on active, coping approach
- acknowledge parents' emotions
- purpose: to disseminate assessment information to parents or other professionals
- evaluate quality of reports before implementing suggested interventions
- expect a comprehensive overview of client and interpretation of results in contextual manner
- should be carefully crafted with attention to detail
- be alert for typographical errors, use of vague jargon, careless mistakes
- identifying information
- reason for referral
- background info
- behavioral observations
- assessment results and interpretations
- diagnostic impressions and summary