A Rationale for Defining Achievement Levels Using IRT-Estimated Domain Scores

A new procedure for defining achievement levels on continuous scales was developed using aspects of Guttman scaling and item response theory. The procedure assigns examinees to levels of achievement when each level is represented by a separate pool of multiple-choice items. Items were assigned to levels on the basis of their content and hierarchically defined level descriptions. The resulting level response functions were well spaced and noncrossing, which allowed well-spaced levels of achievement to be defined by a common percent-correct standard of mastery on the level pools. Guttman patterns of mastery could be inferred from level scores. Compared with two Guttman scoring procedures, the new scoring procedure had higher reliability, higher classification consistency, and lower classification error.
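The scoring idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a three-parameter logistic model with scaling constant 1.7, a hypothetical mastery standard of 80 percent correct, and made-up item parameters in `POOLS`; the function names are likewise illustrative. The level score at a given proficiency is taken to be the expected proportion correct on that level's pool, and an examinee is placed at the highest level for which that pool and all lower pools meet the standard.

```python
import math

def p_correct(theta, a, b, c=0.0):
    """3PL item response function (assumed model, scaling constant 1.7)."""
    return c + (1.0 - c) / (1.0 + math.exp(-1.7 * a * (theta - b)))

def level_score(theta, pool):
    """Expected proportion correct on one level pool at proficiency theta."""
    return sum(p_correct(theta, a, b, c) for a, b, c in pool) / len(pool)

def assign_level(theta, pools, standard=0.8):
    """Highest level mastered, requiring mastery of every lower level too.

    Returns 0 when even the first pool falls below the standard.
    """
    level = 0
    for k, pool in enumerate(pools, start=1):
        if level_score(theta, pool) < standard:
            break
        level = k
    return level

# Hypothetical item parameters (a, b, c), one pool per level, easiest first.
POOLS = [
    [(1.0, -1.5, 0.2), (1.2, -1.0, 0.2), (0.9, -1.2, 0.2)],
    [(1.0, 0.0, 0.2), (1.1, 0.3, 0.2), (0.8, -0.2, 0.2)],
    [(1.0, 1.5, 0.2), (1.3, 1.2, 0.2), (0.9, 1.8, 0.2)],
]

if __name__ == "__main__":
    for theta in (-3.0, 0.0, 4.0):
        scores = [round(level_score(theta, pool), 3) for pool in POOLS]
        print(theta, scores, assign_level(theta, POOLS))
```

Because the level scores in this toy example decrease from the easiest pool to the hardest at each proficiency shown, a single cut on the percent-correct scale yields a Guttman-like pattern: mastery of a level implies mastery of the levels below it.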
