Reporting Diagnostic Scores: Temptations, Pitfalls, and Some Solutions

Diagnostic scores are of increasing interest because of their potential remedial and instructional benefits. Accordingly, the number of testing programs that report diagnostic scores is on the rise, as is the number of research studies on such scores. This paper begins with examples of diagnostic subscores reported by operational testing programs. It then discusses existing psychometric methods for reporting diagnostic scores and briefly reviews a method proposed by Haberman (2008) for examining whether subscores, which are the simplest form of diagnostic scores and are reported by several testing programs, have added value over the total score. Results from several operational and simulated data sets demonstrate that obtaining diagnostic scores with added value is not straightforward. The paper closes with recommendations for those interested in reporting diagnostic scores.
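To make the notion of "added value over the total score" concrete, the following is a minimal sketch of a PRMSE-style comparison (proportional reduction in mean squared error) of the kind associated with Haberman's approach. It is not code from the paper. It assumes classical test theory, a known reliability estimate for the subscore, and measurement error in the subscore that is uncorrelated with the rest of the test. The function names and the toy data are hypothetical.

```python
from statistics import mean


def _cov(a, b):
    """Sample covariance of two equal-length score lists."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def prmse_comparison(sub, total, sub_reliability):
    """Return (PRMSE_s, PRMSE_x) for predicting the TRUE subscore.

    PRMSE_s: based on the observed subscore (equals its reliability).
    PRMSE_x: based on the observed total score (squared correlation
             between the true subscore and the total score).
    Assumption: subscore error is uncorrelated with the rest of the test.
    """
    var_s = _cov(sub, sub)
    var_x = _cov(total, total)
    var_true = sub_reliability * var_s   # variance of the true subscore
    var_err = var_s - var_true           # error variance of the subscore
    # Cov(true subscore, total) = Cov(subscore, total) - error variance
    cov_true_total = _cov(sub, total) - var_err
    prmse_s = sub_reliability
    prmse_x = cov_true_total ** 2 / (var_true * var_x)
    return prmse_s, prmse_x


def has_added_value(sub, total, sub_reliability):
    """True if the subscore predicts its true score better than the total."""
    prmse_s, prmse_x = prmse_comparison(sub, total, sub_reliability)
    return prmse_s > prmse_x


# Hypothetical toy data: five examinees, the subscore is part of the total.
subscores = [1, 2, 3, 4, 5]
totals = [4, 3, 7, 6, 10]
print(has_added_value(subscores, totals, sub_reliability=0.8))
print(has_added_value(subscores, totals, sub_reliability=0.5))
```

In this sketch the same subscore is judged to have added value when its assumed reliability is high and not when it is low. That matches the abstract's point that added value is hard to obtain: short subscores tend to be unreliable and highly correlated with the total score.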

[1]  Louis A. Roussos,et al.  The fusion model skills diagnosis system , 2007 .

[2]  Gautam Puhan,et al.  Comparison of Subscores Based on Classical Test Theory Methods , 2008 .

[3]  Nancy L. Allen,et al.  Interpreting scales through scale anchoring. , 1992 .

[4]  Matthias von Davier,et al.  A general diagnostic model applied to language testing data. , 2008, The British journal of mathematical and statistical psychology.

[5]  Xiangdong Yang,et al.  Multicomponent latent trait models for complex tasks. , 2006, Journal of applied measurement.

[6]  Jonathan Templin,et al.  Using Efficient Model Based Sum‐Scores for Conducting Skills Diagnoses , 2007 .

[7]  Sandip Sinharay,et al.  When Can Subscores Be Expected to Have Added Value? Results from Operational and Simulated Data. Research Report. ETS RR-10-16. , 2010 .

[8]  Shelby J. Haberman,et al.  Reporting of Subscores Using Multidimensional Item Response Theory , 2010 .

[9]  Jeffrey A Douglas,et al.  Higher-order latent trait models for cognitive diagnosis , 2004 .

[10]  Rebecca Zwick,et al.  An Investigation of Alternative Methods for Item Mapping in the National Assessment of Educational Progress , 2005 .

[11]  Mark J. Gierl,et al.  The Attribute Hierarchy Method for Cognitive Assessment: A Variation on Tatsuoka's Rule-Space Approach , 2004 .

[12]  J. Templin,et al.  Unique Characteristics of Diagnostic Classification Models: A Comprehensive Review of the Current State-of-the-Art , 2008 .

[13]  B. Junker,et al.  Cognitive Assessment Models with Few Assumptions, and Connections with Nonparametric Item Response Theory , 2001 .

[14]  M. Reckase The Past and Future of Multidimensional Item Response Theory , 1997 .

[15]  Russell G. Almond,et al.  Modeling Diagnostic Assessments with Bayesian Networks , 2007 .

[16]  Wendy M. Yen A Bayesian/IRT Index of Objective Performance , 1987 .

[17]  J. Bermúdez,et al.  Personality Psychology in Europe , 1997 .

[18]  Shelby J. Haberman,et al.  How Can Multivariate Item Response Theory Be Used in Reporting of Subscores? Research Report. ETS RR-10-09. , 2010 .

[19]  Matthias von Davier,et al.  Comparison of Multidimensional Item Response Models: Multivariate Normal Ability Distributions Versus Multivariate Polytomous Ability Distributions , 2008 .

[20]  Anne Thissen-Roe,et al.  The DIAGNOSER project: Combining assessment and learning , 2004, Behavior research methods, instruments, & computers : a journal of the Psychonomic Society, Inc.

[21]  K. Tatsuoka Rule Space: An Approach for Dealing with Misconceptions Based on Item Response Theory , 1983 .

[22]  Gautam Puhan,et al.  Reporting Subscores: A Survey , 2010 .

[23]  Louis V. DiBello,et al.  A Review of Cognitively Diagnostic Assessment and a Summary of Psychometric Models , 2006 .

[24]  Howard Wainer,et al.  Augmented Scores-"Borrowing Strength" to Compute Scores Based on Small Numbers of Items , 2001 .

[25]  Gautam Puhan,et al.  Reporting subscores for institutions. , 2009, The British journal of mathematical and statistical psychology.

[26]  R. Glaser,et al.  Knowing What Students Know: The Science and Design of Educational Assessment , 2001 .

[27]  Shelby J. Haberman Subscores and Validity , 2008 .