Performance appraisal and performance management: 100 years of progress?

We review 100 years of research on performance appraisal and performance management, highlighting the articles published in the Journal of Applied Psychology (JAP) but including significant work from other journals as well. We discuss trends in eight substantive areas: (1) scale formats, (2) criteria for evaluating ratings, (3) training, (4) reactions to appraisal, (5) purpose of rating, (6) rating sources, (7) demographic differences in ratings, and (8) cognitive processes, and we discuss what has been learned from research in each area. We also focus on trends during the heyday of performance appraisal research in JAP (1970-2000), noting which were more productive and which potentially hampered progress. Our overall conclusion is that JAP's role in this literature has been primarily to test ideas and models proposed elsewhere rather than to propose new ones. Nonetheless, we conclude that the papers published in JAP made important contributions to the field by addressing many of the critical questions raised by others. We also suggest several areas for future research, especially research focusing on performance management.


[65]  E. Lawler,et al.  Employee reactions to a pay incentive plan. , 1973 .

[66]  Bruce J. Avolio,et al.  Race effects in performance evaluations: Controlling for ability, education, and experience. , 1991 .

[67]  P. C. Smith,et al.  Retranslation of expectations: An approach to the construction of unambiguous anchors for rating scales. , 1963 .

[68]  L. Ferguson The development of a method of appraisal for assistant managers. , 1947, The Journal of applied psychology.

[69]  Jeanette N. Cleveland,et al.  Perceived Fairness and Accuracy of Performance Evaluation: A Follow-Up. , 1980 .

[70]  Jack M. Feldman,et al.  Beyond Attribution Theory: Cognitive Processes in Performance Appraisal , 1981 .

[71]  J. Dejung,et al.  Some differential effects of race of rater and ratee on early peer ratings of combat aptitude. , 1962 .

[72]  H. John Bernardin,et al.  Effects of rater training and diary-keeping on psychometric error in ratings. , 1977 .

[73]  Paul R. Sackett,et al.  Rater−ratee race effects on performance evaluation : challenging meta-analytic conclusions , 1991 .

[74]  K. Murphy,et al.  Evaluating the Performance of Paper People , 1986 .

[75]  Kevin R. Murphy,et al.  Do behavioral observation scales measure observation , 1982 .

[76]  Kevin R. Murphy,et al.  Behavioral anchors as a source of bias in rating , 1987 .

[77]  Organization of information in memory and the performance appraisal process: evidence from the field. , 1996 .

[78]  Gary P. Latham,et al.  Effects of goal setting and supervision on worker behavior in an industrial situation. , 1973 .

[79]  W. Borman,et al.  Format and training effects on rating accuracy and rater errors , 1979 .

[80]  M. Ream A statistical method for incomplete order of merit ratings. , 1921 .

[81]  Robert M. McIntyre,et al.  Effect of rater training on rater accuracy: Levels-of-processing theory and social facilitation theory perspectives. , 1987 .

[82]  Thomas P. Cafferty,et al.  Organization of information used for performance appraisals: role of diary-keeping , 1989 .

[83]  Z. C. Dickinson Validity and independent criteria in tests and ratings. , 1937 .

[84]  Mabel Barrett A comparison of the Order of Merit method and the method of Paired Comparisons. , 1914 .

[85]  H. John Bernardin,et al.  Cognitive complexity and appraisal effectiveness: Back to the drawing board? , 1982 .

[86]  Edwin E. Ghiselli,et al.  THE MIXED STANDARD SCALE: A NEW RATING SYSTEM , 1972 .

[87]  M. Taylor,et al.  Consequences of individual feedback on behavior in organizations. , 1979 .

[88]  Robert G. Lord,et al.  Cognitive categorization and dimensional schemata: A process approach to the study of halo in performance ratings. , 1983 .

[89]  D. Paterson The scott company graphic rating scale. , 1922 .

[90]  W. V. Bingham Halo, invalid and valid. , 1939 .

[91]  Steven D. Jones,et al.  Effects of group feedback, goal setting, and incentives on organizational productivity. , 1988 .

[92]  P. Levy,et al.  Participation in the performance appraisal process and employee reactions: A meta-analytic review of field investigations. , 1998 .

[93]  J. Guilford,et al.  Some constant errors in ratings , 1938 .

[94]  R. Travers A critical review of the validity and rationale of the forced-choice technique. , 1951, Psychological bulletin.

[95]  H. Rugg Is the rating of human character practicable , 1921 .

[96]  W C Borman,et al.  Consistency of rating accuracy and rating errors in the judgment of human performance. , 1977, Organizational behavior and human performance.

[97]  A. Kluger,et al.  The effects of feedback interventions on performance: A historical review, a meta-analysis, and a preliminary feedback intervention theory. , 1996 .

[98]  C. L. Jaffee,et al.  Effects of incentive, feedback, and manner of presenting the feedback on leader behavior. , 1974 .

[99]  Donald W. Fiske,et al.  The consistency of ratings by peers. , 1960 .

[100]  Angelo S. DeNisi,et al.  The role of appraisal purpose: effects of purpose on information acquisition and utilization , 1985 .

[101]  G. Yukl,et al.  Effects of multisource feedback and a feedback facilitator on the influence behavior of managers toward subordinates. , 2003, The Journal of applied psychology.

[102]  H. H. Meyer Methods for scoring a check-list type rating scale. , 1951, The Journal of applied psychology.

[103]  John M. Ivancevich,et al.  Longitudinal study of the effects of rater training on psychometric error in ratings. , 1979 .

[104]  Herman Aguinis Performance Management , 2005 .

[105]  Donald P. Schwab,et al.  Age stereotyping in performance appraisal. , 1978 .

[106]  George C. Thornton,et al.  Training to improve observer accuracy. , 1980 .

[107]  E. Hollander Validity of peer nominations in predicting a distant performance criterion. , 1965, The Journal of applied psychology.

[108]  Elaine D. Pulakos,et al.  A comparison of rater training programs: Error training and accuracy training. , 1984 .

[109]  E. Thorndike A constant error in psychological ratings. , 1920 .

[110]  John K. Butler,et al.  Lecture vs. group decision in changing behavior. , 1952 .

[111]  H. H. Remmers,et al.  Reliability and halo effect of high school and college students' judgments of their teachers. , 1934 .

[112]  Steve W. J. Kozlowski,et al.  The Systematic Distortion Hypothesis, Halo, and Accuracy: An Individual-Level Analysis , 1987 .

[113]  James W. Smither,et al.  CAN MULTI-SOURCE FEEDBACK CHANGE PERCEPTIONS OF GOAL ACCOMPLISHMENT, SELF-EVALUATIONS, AND PERFORMANCE-RELATED OUTCOMES? THEORY-BASED APPLICATIONS AND DIRECTIONS FOR RESEARCH , 1995 .

[114]  E. Lawler,et al.  The multitrait-multirater approach to measuring managerial job performance. , 1967, The Journal of applied psychology.

[115]  H. F. Rothe,et al.  Output rates among coil winders. , 1958 .

[116]  A. Denisi,et al.  The effect of performance appraisal salience on recall and ratings , 1990 .

[117]  W. Clay Hamner,et al.  Race and sex as determinants of ratings by potential employers in a simulated work-sampling task. , 1974 .

[118]  Kurt Kraiger,et al.  A meta-analysis of ratee race effects in performance ratings. , 1985 .

[119]  James W. Smither,et al.  DOES PERFORMANCE IMPROVE FOLLOWING MULTISOURCE FEEDBACK? A THEORETICAL MODEL, META‐ANALYSIS, AND REVIEW OF EMPIRICAL FINDINGS , 2005 .

[120]  Deborah DiazGranados,et al.  Author Notes , 1994, Schools of Thought.

[121]  D. N. Buckner,et al.  THE PREDICTABILITY OF RATINGS AS A FUNCTION OF INTER-RATER AGREEMENT , 1959 .

[122]  Herbert H. Meyer,et al.  Split Roles in Performance Appraisal , 1981 .

[123]  Kevin R. Murphy,et al.  Performance appraisal: An organizational perspective. , 1991 .

[124]  Brian R. Kay,et al.  The use of critical incidents in a forced-choice scale. , 1959 .

[125]  William K. Balzer,et al.  Relationship between observational accuracy and accuracy in evaluating performance. , 1982 .

[126]  Jeanette N Cleveland,et al.  Raters who pursue different goals give different ratings. , 2004, The Journal of applied psychology.

[127]  David J. Woehr,et al.  Understanding frame-of-reference training: the impact of training on the recall of performance information , 1994 .

[128]  James R. Berkshire,et al.  Forced-Choice Performance Rating—A Methodological Study* , 1953 .

[129]  Jeffrey B Vancouver,et al.  The effect of feedback sign on task performance depends on self-concept discrepancies. , 2004, The Journal of applied psychology.

[130]  R. Lord Accuracy in behavioral measurement: An alternative definition based on raters' cognitive schema and signal detection theory. , 1985 .

[131]  G. Yukl,et al.  Subordinate personality as a moderator of the effects of participation in three types of appraisal interviews. , 1973 .