MGMT969 Chapter 3 HR Measurement
- Nominal - classification, e.g. classifying employees as male or female (does not rank; there is no equality of difference and no absolute 0).
- Ordinal - e.g. supervisor ranking of subordinates. It classifies and orders (1-10, 1 being most and 10 being least), but there is no absolute 0 and no equality of difference, e.g. the difference between 1 and 2 may not be the same as the difference between 3 and 4.
- Interval - e.g. employee rating of job satisfaction. Scores have equality of difference and can be classified and ranked; however, there is no absolute 0, i.e. a 0 on a math test does not mean the candidate has no math ability.
- Ratio - e.g. counting the number of units an employee produces. This can classify, order, and show equality of difference, and there is an absolute 0. This is rarely used in HR selection.
- Interval and ratio scales are the most desirable because they are the most precise.
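The four scales (NOIR) can be summarised as a small lookup table. The sketch below is only an illustration of the properties listed above; the class name `Scale` and the helper `supports_mean_and_sd` are hypothetical names chosen for this example, and the rule that means and standard deviations need equal intervals follows the strict view of measurement levels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scale:
    """One measurement scale and the properties it has (hypothetical helper)."""
    name: str
    classifies: bool
    orders: bool
    equal_intervals: bool
    absolute_zero: bool


# Properties taken from the four scale descriptions above.
SCALES = [
    Scale("nominal", True, False, False, False),
    Scale("ordinal", True, True, False, False),
    Scale("interval", True, True, True, False),
    Scale("ratio", True, True, True, True),
]


def supports_mean_and_sd(scale: Scale) -> bool:
    """Under the strict view, means and SDs assume equal intervals between scores."""
    return scale.equal_intervals


for s in SCALES:
    print(f"{s.name:8s} mean/SD meaningful: {supports_mean_and_sd(s)}")
```

Only the interval and ratio rows pass the check, which is why those two scales are described as the most desirable.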
Most measurement in selection is indirect. That is, we cannot directly assess the person's actual level of mental ability or achievement motivation or even attitude toward a proposed career.
The person's standing on the unobservable trait (intelligence or work motivation) is inferred from the score on the test or the predictor itself. The unit of measurement for the underlying construct (e.g. mental ability or achievement motivation, assessed with a 25-item test) may only use an ordinal scale. Therefore the critical issue is whether the scores - not the hypothetical construct - use an ordinal or interval scale.
Measurement is critical because the use of meaningless numbers will result in meaningless conclusions, no matter how complex the statistical analyses. Ultimately, there is no substitute for high-quality measurement methods. In selection, if the number signifies degrees of difference in units that reflect nearly equal intervals, not just more than or less than (ordinal), then you have a quality predictor or criterion.
Keep in mind that such a conclusion is grounded upon the assumption that these numbers can be meaningfully interpreted as representing the unobservable construct underlying the number. If this is true, you can reasonably subject these numbers to useful interval scale statistical analyses (e.g. parametric statistics, including means, standard deviations, Pearson product-moment correlations and F-statistics).
There has been controversy about levels of measurement and the use of statistical analysis. The essence of the critics' argument is that unless the number clearly represents an equal interval scale, one should strictly adhere to the assumption that the numbers are ordinal at best. This means the types of statistics that are permissible are severely restricted, e.g. the use of means and standard deviations is not appropriate.
- A pragmatic view is that, given the state of measurement in selection research, it is reasonable to treat most selection measures as if they were on an interval scale; treating them as if they were solely on an ordinal level would result in the loss of useful information.
- Conclusions based on interval level statistics will likely be quite useful.
Standardization of selection measurement - a variety of measures are used in human resource selection, and they differ in quality of measurement.
Measurement in the context of selection usually refers to the systematic application of preestablished rules or standards for assigning scores to the attributes or traits of an individual.
A selection measure provides information to be used as a predictor (a test) or a criterion (a supervisor's performance appraisal rating). It can involve any one of the 4 measurement scales (NOIR). An important characteristic of any selection measure is its ability to detect any true differences that may exist among individuals with regard to the attribute being measured. E.g. if true differences in knowledge of disease prevention exist among job candidates applying for a nurse practitioner's job, then a disease prevention knowledge test must be able to detect these differences. If test score differences in disease prevention are found, these score differences should be due to differences in the candidates' knowledge and not due to extraneous factors such as differences in the manner in which the test was given to the job candidates or the way in which it was scored. To help control such factors, systematic or STANDARDIZED measurement is needed in human resource selection.
A predictor or criterion measure is standardized if it possesses each of the following characteristics:
- 1. Content - all persons being assessed are measured by the same information or content. This includes the same format (e.g. multiple choice, essay) and medium (e.g. paper and pencil, computer, video).
- 2. Administration - information is collected the same way in all locations and across all administrators, each time the selection measure is applied.
- 3. Scoring - rules for scoring are specified before administering the measure and are applied the same way with each application. E.g. if scoring requires subjective judgement, steps should be taken (such as rater training) to ensure inter-rater agreement or reliability.
No matter how many times a measure is administered or to whom it is given, the same content, administration and scoring results from the use of a standardized measurement procedure.
From the standpoint of professional practice, truly standardized measurement is more an ideal than a reality.
Standardization is a goal we strive for but do not always attain. Sometimes it may even be necessary to alter the standardization procedure. E.g. it may be necessary to modify the administration of a test in order to accommodate the special needs of a disabled job applicant.
Principal role of a manager involved in HR selection = deciding whether an applicant should or should not be hired.
Managers use predictors and criteria variables when making selection decisions.
Criteria (e.g. supervisory ratings of job performance or number of errors) are employed as part of a research study designed to determine if the predictors are really measuring those aspects of job success that the predictors were designed to predict. This is known as a VALIDATION STUDY.
- Criterion measures really help serve as a standard for evaluating how well predictors do the job they were intended to do. Predictors have a DIRECT impact on the decisions reached by a manager involved in HR selection decisions.
- A manager actually reviews an applicant's scores on a predictor and uses this information in deciding whether to accept or reject the applicant. Criteria play an INDIRECT role in that they determine which predictors should actually be used in making selection decisions.
Selection measures mean BOTH predictors and criteria.
- Predictors - numerous types of predictors have been used to forecast employee performance. Major trends fall into three categories:
- 1. background information
- 2. Interviews
- 3. Tests
- Criteria - classified by the measurement methods used to collect data:
- 1. Objective production data
- 2. Personnel data
- 3. Judgemental data
- 4. Job or work sample data
- 5. Training proficiency data
Background information (predictor)
Application forms, reference checks and biographical data questionnaires are generally used to collect information on job applicants and their backgrounds. APPLICATION FORMS typically ask job applicants to describe themselves and their previous work histories. Applicants are usually asked about current address, previous education, past employment, relevant licenses etc. Prospective employers make REFERENCE CHECKS by contacting individuals who can accurately comment on the applicant's characteristics and background. These checks are often used to verify information stated on the application form as well as to provide additional data on the applicant. BIOGRAPHICAL DATA QUESTIONNAIRES consist of questions about the applicant's past life history. Past life experiences are assumed to be good indicators of future work behaviours.
Employment interviews are used to collect additional information about the applicant. The employment interview principally consists of questions asked by a job interviewer. Responses are used for assessing an applicant's suitability for a job and fit to the organization.
- Literally hundreds of tests have been used for selection purposes. There are many different ways to classify the types of tests available. The descriptive labels indicated here will give you a feeling for the range of options available. Aptitude or ability tests, for example, are used to measure how well an individual can perform specific parts of a job. Abilities measured by aptitude tests include intellectual, mechanical, spatial, perceptual, and motor. Achievement tests are employed to assess an individual's proficiency at the time of testing (e.g. job proficiency or knowledge). Personality tests in the context of selection are used to identify those candidates who will work harder and cope better at work, which should also relate to success on the job.
- Administration of tests varies: by medium (e.g. written, computer), by time frame (e.g. timed or untimed), and by whether the test is practical or demonstrative (manipulating physical objects or performing a task). Some tests are given in a group setting and some individually.
Criterion measures usually assess some kind of behaviour or performance on the job (e.g. organizational citizenship behaviours, sales, turnover, supervisory ratings of performance) that is important to the organisation. They are called criterion measures because they are used to evaluate the predictors used to forecast performance. Therefore they are the criteria for evaluating selection procedures. Although these measures are based on behaviour - what the employee does - the observed behaviour is a measure of a theoretical construct (e.g. "true" job performance). Consequently criterion measures are susceptible to the same measurement concerns as are written tests or employment interviews.
The criterion is also concerned with measuring things that can be used to assess the usefulness of the predictors. The quality of these inferences is based on the scale of measurement. Consequently, it is critical that the selection expert ensures the criterion approximates at least an interval scale. Numerous criteria are available to test important aspects of work behaviour. One way to classify these criteria is by the measurement methods used to collect the data.
objective production data (criteria measurement data)
These data tend to be physical measures of work. Number of goods produced, amount of scrap left, and dollar value of sales are examples of objective production data.
Personnel records and files frequently contain information on workers that can serve as important criterion measures. Absenteeism, tardiness, voluntary turnover, accident rates, salary history, promotions, and special awards are examples of such measures.
Performance appraisals or ratings frequently serve as criteria in selection research. They most often involve a supervisor's rating of a subordinate on a series of behaviours or outcomes found to be important to job success, including task performance, citizenship behaviour, and counterproductive behaviour. Supervisor or rater judgments play a predominant role in defining this type of criterion data.
Job or work sample data
These data are obtained from a measure developed to resemble the job in miniature or to sample specific aspects of the work process or outcomes (for example, a typing test for a secretary). Measurements (for example, quantity and error rate) are taken on individual performance of these job tasks, and these measures serve as criteria.
Training Proficiency data
This type of criterion focuses on how quickly and how well employees learn during job training activities. Often, such criteria are labeled TRAINABILITY measures. Error rates during a training period and scores on training performance tests administered during training sessions are examples of training proficiency data.
criteria for evaluating selection measures
Choosing the measures to be employed as predictors and as criteria indicative of job success. One of several questions to ask is: what characteristics should I look for in selecting a predictor or a criterion?
Questions important for predictors and criteria
- Some questions serve as a checklist for reviewing each measure considered. Unless you can be sure that a measure meets these standards, you may not have the best measure you need for your selection research. If a measure falls short, there are 2 options:
- A - determine whether you can adjust your data or the way you calculate the score itself so it will meet each measurement evaluation criterion (e.g. ask the supervisor to rate each employee's performance on a 9-point scale instead of rank ordering employees), or
- B - if option A is not viable, find or develop another, more suitable measure for the underlying construct.
The factors are not of equal importance. Issues concerning reliability, validity and freedom from bias are obviously more critical than administrative concerns such as acceptability to management. Other factors will vary in importance depending on the specific situation. Regardless, the wise selection manager will give serious consideration to each one.
Questions for Predictors?
- 1. What does the predictor measure?
- 2. Is the predictor cost effective?
- 3. Has the predictor been standardized (scoring, timing, administering etc.)?
- 4. Is the predictor easy to use? Does it need high levels of training? This can be costly.
- 5. Is the predictor acceptable to the organization? To management? To the candidate? (Content can be offensive or an invasion of privacy, making it controversial or unacceptable.)
Questions for Criteria?
- 1. Is the criterion relevant to the job for which it is chosen?
- 2. Is the criterion acceptable to management? E.g. if management considers revenue most important, actual sales will be more important than saleability.
- 3. Are work changes likely to alter the need for the criterion? If work changes are rapid or fluid, organisation-wide criteria such as turnover or effectively dealing with time deadlines may be a better reflection of success. Criteria should be assessed regularly as they are subject to change.
- 4. Is the criterion uncontaminated and free of bias, so that meaningful comparisons among individuals can be made? Contaminating variables include differences between sales territories, differences in tools and equipment, differences in work shifts and differences in the physical conditions of the job. Group bias occurs when a group characteristic is assumed to relate to individual employee performance. Common examples of this kind of bias include age and job tenure, which are often assumed to be related to performance.
- 5. Will the criterion detect differences among individuals if differences actually exist (discriminability)? Are meaningful differences among individuals actually scored with respect to the criterion? Will the criterion actually show the differences between the employees? If it shows outstanding for all when only 20% are outstanding, then it is not showing the differences that actually exist.
Questions for predictors and criteria
- 1. Does the measure unfairly discriminate against sex, race, age or other protected groups?
- 2. Does the measure lend itself to quantification? Numbers on interval or ratio scales are preferred because one can use more sophisticated statistical analyses.
- 3. Is the measure scored consistently? There need to be specific rules and procedures. All raters should receive the same training and the same opportunities to observe and evaluate.
- 4. How reliable are the data provided by the measure? Scores should remain consistent if the measure is reapplied (i.e. a score of 95 should yield a score of approximately 95 the next time).
- 5. How well does the device measure the construct for which it is intended (construct validity)? The obtained scores on the predictor or criterion should assess what they are meant to (e.g. intelligence or job performance).
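Question 4 above (do scores stay consistent when the measure is reapplied?) can be checked numerically. The sketch below assumes one common approach, correlating scores from two administrations of the same measure with a Pearson correlation; the notes do not prescribe this method, and the function name `pearson_r` and the sample scores are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from each mean.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for five candidates tested on two occasions.
first_administration = [95, 80, 72, 88, 60]
second_administration = [94, 82, 70, 90, 63]

r = pearson_r(first_administration, second_administration)
print(f"test-retest correlation: {r:.2f}")
```

A value close to 1 suggests candidates keep roughly the same relative standing across administrations; it says nothing about validity, which is question 5.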
Locating existing selection measures
There are several advantages to finding and using existing selection measures.
- 1. Less expensive and less time consuming than developing new ones.
- 2. If previous research was conducted, we have some idea about reliability, validity and other characteristics of the measures.
- 3. Existing measures often will be superior to what could be developed in-house.
There are many types of measures available. The vast majority of them e.g. tests, are intended to be used as predictors.
- Commercially available predictors include:
- intelligence, ability, interest, personality inventories.
- Developed: application blanks, biographical data questionnaires, reference check forms, employment interviews and work sample measures.
- Criteria measures are generally not published and will probably have to be constructed by a user.
Text and reference sources
- Annual review of psychology
- Research in personnel and Human resources management
- There are also several other books on technical aspects. Reviews of current selection research also published.
Buros Mental Measurements Yearbooks
- Most important source of information on tests for personnel selection. It contains critical reviews by test experts and bibliographies of virtually every test printed in English. Oscar Buros was editor of the Yearbook for many years. Historically issued every 6 years, now every 18-24 months. The most recent is the Sixteenth Mental Measurements Yearbook (2005).
- Other supplementary books include Tests in Print (TIP) I-VI, which serves as a comprehensive bibliography of all known commercially available tests currently in print.
TIP Tests in Print I-VI
Reviews provide vital information to users, including test purpose, test publisher, in-print status, price, test acronym, intended test population, administration times, publication dates and test authors. A score index permits users to identify the specific characteristics measured by each test.
Other reference source
- 1. Tests: A Comprehensive Reference for Assessments in Psychology, Education and Business (fifth edition) describes more than 2000 assessment instruments categorized in 90 subcategories. This reference includes information on each instrument's purpose, scoring procedures, cost and publisher. It does not critically review or evaluate tests. Evaluations are provided in the companion volume, Test Critiques.
- 2. Test Critiques, Volumes I-XI, provides evaluations of more than 800 of the most widely used tests in terms of practical applications and uses and technical aspects (validity, reliability, normative data, scoring). It also provides a critique of the instruments.
Educational Testing Service (ETS) of Princeton, NJ
- has several reference sources available, e.g. the Test Collection database with a library of more than 25,000 tests and surveys.
- Education Resources Information Center (ERIC) - 1.1 million citations going back to 1966 - largest of its kind (web site bibliographic database)
- Journal of Applied Psychology (JAP)
- Personnel Psychology
- Other, less prominent journals include:
- International journal of Selection and Assessment, Human Performance, Journal of Occupational and Organizational Psychology, The Industrial-Organizational Psychologist (TIP), Educational and Psychological Measurement, Applied Psychological Measurement
- ATP - Association of test publishers (non profit - 1992) represents a large number of test publishers and providers of assessment services.
- Tries to uphold a high level of professionalism and business ethics related to the publishing of assessments.
American Psychological Association three levels of classification
- Level A- This level consists of those tests that require very little formal training to administer, score and interpret. Most selection practitioners may purchase tests at this level. A typing test is representative of tests in this classification.
- Level B - Tests classified in this category require some formal training and knowledge of psychological testing concepts to properly score and interpret. Aptitude tests designed to forecast an individual's performing potential are of this type. Individuals wishing to purchase Level B tests must be able to document their qualifications for correctly using such tests. These qualifications usually include evidence of formal education in tests and measurement as well as in psychological statistics.
- Level C - These tests require the most extensive preparation on the part of the test administrator. In general, personality inventories and projective techniques make up this category. A Ph.D. in psychology and documentation of training (e.g. a course taken on a particular test) are required. Tests in this category tend to be less frequently used in selection contexts than tests in levels A and B.
- APA (American Psychological Association) - not a source of selection measures, but provides an excellent treatment of the considerations in selecting and using tests.
- American Banking Association
- LIMRA - insurance companies
- American Foundation for the Blind (a variety of information and some selection measures for the visually impaired)
Minimum recommendations for choosing an existing selection measure
- 1. Be sure you clearly and completely understand the attribute or construct you want to measure. Decide on the best means for assessing the attribute eg paper and pencil, work sample etc.
- 2. Search for and read critical reviews and evaluations of the measure. References mentioned earlier (e.g. Buros Institute and Pro-Ed publications) are excellent sources of such reviews. These sources should be able to provide background information on the test, including answers to relevant questions: What theory was the test based on? What research was used to develop the test? This is important because it provides information on the logic, care and thoroughness with which the test was developed. It should allow you to assess how relevant this information is with regard to your organization's applicants or employees.
- 3. If a specimen set of the measure (including a technical manual) is available, order and study it. Ask yourself, "Is this measure appropriate for my intended purpose?" In answering this question, study carefully any information relative to the measure's reliability, validity, fairness, intended purpose, method of administration, time limits, scoring, appropriateness for specific groups, and norms. If such information is not available, you may want to consider other measures.
- 4. Once you have completed steps 1 through 3, ask yourself, "Based on the information I reviewed, are there compelling arguments for using this measure? Or, on the other hand, are there compelling arguments against using it?"
HR professionals raise legitimate concerns about whether it is reasonable to expect practitioners to develop selection measures, particularly in light of the technical and legal ramifications associated with them, and whether an organization has the resources, time and expertise to develop such measures.
Most HR departments will not have the resources and knowledge to construct selection measures. Consultants will likely be needed. The material is presented to cover the basic issues involved, so as to monitor and evaluate the work of a consultant and bridge any possible communication gap between the organization and the consultant.
Steps in Developing Selection Measures
- 1. Analyzing the job for which a measure is being developed.
- 2. Selecting the method of measurement to be used.
- 3. Planning and developing the measure.
- 4. Administering, analyzing, and revising the preliminary measure.
- 5. Determining the reliability and validity of the revised measure for the jobs studied.
- 6. Implementing and monitoring the measure in the human resource selection system.
Analysis of work (step 1 in developing selection measures)
- Most crucial
- After analyzing the work itself - should be able to specify constructs/knowledge, skill and ability domains the predictor will assess (critical - if not correct all steps flawed)
- Modern organizations find the nature of the work employees perform changing too rapidly for traditional job analysis.
- Causes include technological advances, the changing nature of the social setting (e.g. more interaction with customers, team members or independent contract workers) and the external environment (offshoring of work etc.).
- Today there is no "one size fits all" job analysis procedure.
- Consequently, job analysis is used either to determine the KSAs and other characteristics necessary for adequately performing a specific job, or to identify employee competencies and types of work activities from a broader analysis of the work.
- In addition to its role in developing and choosing selection measures, analysis of work also provides the foundation for developing criterion measures of job performance. We cannot do systematic research on the selection of personnel until we know which of our selected applicants have become successful employees. Intimate knowledge of the work gained through analysis of work will help us to identify or develop measures of job success.
- Ultimately, through validation research we will determine the extent to which our selection measures can actually predict these criteria.
- The more researchers know about a specific job, the more likely it is that their hypotheses and ideas about selection measures will have predictive utility for that job. Therefore researchers must choose the method of analysis they believe will yield the most useful information about work.
Selecting measurement method - Step 2 in developing selection methods.
- Comes after identifying what to measure (KSAs or other characteristics).
- A host of methods is available for selection (paper and pencil tests, job or work sample tests, ratings, interviews, biographical data questionnaires).
- Variables that will affect the selection of a measurement medium: specific nature of the job (tasks performed, level of responsibility), skill of test administrators, number and types of applicants, cost of testing, resources available.
- The method chosen will ultimately depend on the job and organizational context in which the job is performed.
- checklists are useful to match selection methods with job requirements.
Planning and developing the selection measure (step three in developing selection measures)
- In this step the researcher clarifies the nature and details of each selection measure and prepares an initial version of the measure. The specifications required for each measure considered should include the following:
- 1. The purposes and uses the measure is intended to serve
- 2. The nature of the population for which the measure is to be designed.
- 3. The way the behaviours or KSAO - other attributes- will be gathered and scored. This includes decisions about the method of administration, the format of test items and responses, and the scoring procedures.
- SMEs (subject matter experts) can create the items or rewrite them. They also review the appropriateness of item content and format for fulfilling the measure's purpose etc.
- Administration methods and scoring procedures should also be developed. There is a fixed format and an open-ended format. Fixed format is most popular as it is most efficient, with few or no scoring errors, and can easily and reliably be transformed into a numerical scale for scoring purposes. Free response is useful to provide greater detail or richer samples of candidates' behaviour and allows for unique characteristics to emerge (e.g. creativity).
- Explicit scoring of the measure is particularly critical - well developed hiring tools will provide an 'optimal' score for each item that is uniformly applied.
- Ways to standardize administration include time limits for completion. Consideration must be given to reasonable accommodations in administration based on the Disabilities Act, e.g. oral answers to a written test for blind candidates.
- Standardizing is important and must be balanced with fairness for all candidates.
- Sampling designs and statistical procedures are to be used in selecting and editing items, questions and other elements of the measure. Selection devices require a number of steps to ensure more accurate measurement. SMEs assure job relevance and help screen out objectionable items that might produce bias. Development may include pilot testing that is double-checked with a holdout sample (distorted results can occur if tested on the same group twice).
- Be wary of tests that consist of a limited number of items (e.g. a 15 item test that purports to measure five different factors). Measuring an attribute or trait consistently and accurately requires more than just a couple of items.
step four administering, analyzing and revising
- Pilot the test with a sample of people similar (in demographics etc.) to the candidates for whom the test is designed.
- The sample should have a minimum of 100 people; a few hundred is preferred.
- Answers should be analyzed to see if test is useful (fair, effective, tests what it should test).
- Data analysis will give statistics to be analysed and summarized.
- psychometric characteristics:
- 1. The reliability or consistency of scores on the items
- 2. the validity of intended inferences
- 3. Item fairness or differences among subgroups.
- Number of considerations must guide the item analysis and revision process.
- Developer of the test must possess extensive technical knowledge about test development
- Quality of the items used in the test has a significant impact on the overall quality of the measure.
- CRITICAL STEP
Step 5 - determining reliability and validity -
- After revision come reliability and validity studies to test the hypothesis.
- Three questions to answer:
- 1. Are the scores on our selection measure dependable for selection decision making purposes?
- 2. Is the selection measure predictive of job success?
- 3. Does the test measure what we think it measures?
Step 6 Implementing the selection measure
- After reliability and validity evidence, we can implement measure.
- Cutoff or passing scores may be developed.
- Norms or standards for interpreting how various groups score on the measure will be developed to help interpret results. after implemented - continued monitoring of performance to ensure it is performing the function intended.
- It can easily take 3-5 years to develop a commercially successful test.
- Three reasons why selection researchers frequently use suitable existing measures rather than develop their own:
- 1. Technical ramifications
- 2. Associated costs
- 3. Potential for legal action.
Interpreting scores on selection measures
- 1. Using Norms
- 2. Using Percentiles
- 3. Using Standard scores
- need two essential pieces of information
- 1. how others scored on the measure
- 2. the validity of the measure
- In order to attach meaning to a score we must compare it to the scores of others who are similar with regard to characteristics such as education level, type of job, amount of work experience.
- Norms show how someone performed in respect to scores of relevant others.
- Normative sample or standardization group is a group of persons for whom test score information is available and used for comparison purposes.
- Things to keep in mind -
- Norm group should be RELEVANT for the purpose used.
- Employers should use LOCAL norms as soon as 100 or more scores in a particular group have been accumulated.
- Norms are TRANSITORY - specific to the point in time they were collected. older norms become more and more irrelevant.
- Normative data (expressed in percentiles and standard scores) are helpful in interpreting scores on selection measures, but they do not tell us what a score means in terms of important job behaviours or criteria of job performance.
- Percentiles are most frequently used in reporting normative data because the purpose of a norm is to show relative standing in a group. Percentile scores are derived to show the percentage of persons in a norm group who fall below a given score on a measure. Generally, the higher the percentile, the better a person's performance relative to others in the normative sample.
- Percentiles are useful in interpreting test scores but subject to misuse. Percentile scores are on an ordinal scale and have no equality of difference; they are misused when people interpret them as interval scales. We can make greater than or less than statements comparing scores but cannot say how much higher or lower one percentile score is than another.
- Standard scores can be expressed as a z score, which represents the difference between an individual score and the mean, in units of the standard deviation.
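The percentile definition above (the percentage of persons in a norm group who fall below a given score) can be computed directly. This is a minimal sketch under that definition; the function name `percentile_rank` and the norm group scores are hypothetical, and other textbooks use slightly different conventions for tied scores.

```python
def percentile_rank(score, norm_scores):
    """Percentage of the norm group scoring strictly below `score`."""
    if not norm_scores:
        raise ValueError("norm group must contain at least one score")
    below = sum(1 for s in norm_scores if s < score)
    return 100.0 * below / len(norm_scores)


# Hypothetical norm group of ten raw scores.
norm_group = [10, 12, 15, 15, 18, 20, 22, 25, 27, 30]

print(percentile_rank(22, norm_group))  # 6 of 10 scores are below 22 -> 60.0
```

Because the result only counts how many people fall below a score, it carries ordinal information: the gap between the 60th and 70th percentiles need not equal the gap between the 80th and 90th in raw score terms.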
using standard scores
- In general standard scores represent adjustments to raw scores, so it is possible to determine the proportion of individual who fall at various standard score levels.
- z score - z = (X-M) / SD
- Computation results in scores that range from -4 to +4. A z score shows how far a score is above or below the mean (which is represented by 0), e.g. 1 standard deviation above or below it.
- T scores are used to avoid negative scores (all are positive). T = 10z + 50; therefore a z score of 1 would be a T score of 60.
- Stanine scores are another form of standard score, using numbers ranging from 1-9 to represent normative score performance.
- The biggest problem with using standard scores is that they are open to misinterpretation.
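The standard score conversions above can be worked through in a few lines. The z and T formulas follow the notes (z = (X - M) / SD, T = 10z + 50). The stanine conversion is an assumption: it uses a common approximation (a scale with mean 5 and SD 2, rounded and limited to 1-9), since the notes give no formula. Function names and the example mean and SD are hypothetical.

```python
import math


def z_score(x, mean, sd):
    """z = (X - M) / SD, using the norm group's mean and standard deviation."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / sd


def t_score(z):
    """T = 10z + 50, which keeps typical scores positive."""
    return 10 * z + 50


def stanine(z):
    """Approximate stanine (assumption: mean 5, SD 2, rounded, limited to 1-9)."""
    return max(1, min(9, math.floor(2 * z + 5.5)))


# Hypothetical norm group with mean 50 and SD 15; candidate raw score of 65.
z = z_score(65, mean=50, sd=15)
print(z, t_score(z), stanine(z))  # 1.0 60.0 7
```

A raw score one standard deviation above the mean gives z = 1, T = 60 and a stanine of 7, matching the T score example in the notes.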