Establishing and applying performance standards for curriculum-based examinations

This paper describes how a state education system in Australia introduced standards-referenced assessment into its large-scale, high-stakes, curriculum-based examinations in a way that allows performance to be compared across time, even though the examinations differ each year. It describes the multi-stage modified Angoff standard-setting procedure used to establish cut-off scores on subject examinations, and how the results of this exercise were then used to develop standards packages that illustrate the performance of students at the borders between the various bands. The paper also explains that the original intention was to use a Rasch measurement model to generate the statistical feedback used in the standard-setting procedure, and describes the modifications to that feedback that were necessary to meet the real-time constraints of this large-scale examination programme. It argues that consideration should now be given to using the Rasch model to provide this feedback in place of the current approach.
