Standard-setting methodology: Establishing performance standards and setting cut-scores to assist score interpretation.

A critical step in the development and use of tests of physical fitness for employment purposes (e.g., fitness for duty) is to establish one or more cut points that divide the test score range into two or more ordered categories reflecting, for example, fail/pass decisions. Over the last three decades, elaborate theories and methods have evolved that focus on the process of establishing one or more cut-scores on a test. This process is widely referred to as "standard-setting". The validity of the test score interpretation hinges on the standard-setting, which embodies the purpose and rules according to which the test results are interpreted. The purpose of this paper is to provide an overview of standard-setting methodology. The essential features, key definitions and concepts, and various novel methods of informing standard-setting are described. The focus is on foundational issues, with an eye toward informing best practices with new methodology. Throughout, a case is made that, in terms of best practices, establishing a test standard involves, in good part, setting a cut-score, and that it can be conceptualized as evidence/data-based policy making that is essentially tied to test validity and an evidential trail.
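As a purely illustrative sketch of how cut-scores divide a score range into ordered categories, the following Python function maps a score to a category label. The function name `classify`, the example cut-scores, and the labels are hypothetical and are not taken from the paper. The sketch also assumes that higher scores indicate better performance and that a score equal to a cut-score falls in the higher category; a timed fitness test, where lower is better, would need the opposite convention.

```python
from bisect import bisect_right


def classify(score, cut_scores, labels):
    """Map a test score to an ordered category, given ascending cut-scores.

    Assumption (not from the paper): higher scores are better, and a score
    equal to a cut-score is placed in the higher category.
    """
    if len(labels) != len(cut_scores) + 1:
        raise ValueError("need exactly one more label than cut-scores")
    if list(cut_scores) != sorted(cut_scores):
        raise ValueError("cut-scores must be in ascending order")
    # bisect_right counts the cut-scores that are <= score,
    # which is the index of the category the score belongs to.
    return labels[bisect_right(cut_scores, score)]


# One cut-score yields two categories (fail/pass).
print(classify(49, [50], ["fail", "pass"]))
print(classify(50, [50], ["fail", "pass"]))

# Two cut-scores yield three ordered categories.
print(classify(75, [50, 80], ["fail", "pass", "exceeds"]))
```

The sketch shows only the mechanical classification step. Choosing where the cut-scores lie, and documenting the evidence for that choice, is the standard-setting process the paper addresses.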
