The Present and Future of IRT‐Based Cognitive Diagnostic Models (ICDMs) and Related Methods

As the goals of educational assessment evolve from the strictly evaluative to the diagnostically useful, so also evolve the statistical methods used to build, validate, and interpret educational tests. The methods discussed in this special issue all approach diagnosis in an item response theory (IRT) related way, with models that are parameterized at the item level and that extract information from individual item responses. Clearly, their most distinguishing feature is their more complex, multidimensional representation of examinee proficiency. This representation can be built directly into an item response model (as seen most clearly in Almond, DiBello, Moulder, & Zapata-Rivera, 2007; Henson, Templin, & Douglas, 2007; Roussos, Templin, & Henson, 2007; Stout, 2007) or else it can provide a framework for interpreting (residual) patterns in item responses (as seen in Gierl, 2007).

The complexity of the proficiency space introduces corresponding complexities into the statistical modeling and score reporting aspects of diagnosis. A high level of expert judgment is needed in formulating appropriate models. One of the primary challenges in implementing IRT-based cognitively diagnostic models (ICDMs) is determining which aspects of the modeling process should be constrained through expert judgment and which can and should be informed by observed item response data. The vast array of psychometric models now available for diagnosis and the different ways they handle these complexities (e.g., how many levels for each skill, how skills interact, how skill mastery translates to item performance, etc.) make model selection a central issue. At the same time, it can be challenging to compare models according to goodness of fit because of the many other aspects within each model that must be informed by experts (e.g., entries of the item-by-skill Q matrix, structure of the proficiency space, etc.). Data-driven model re-specification is often messy.
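To make the Q matrix and the skill-mastery-to-performance link concrete, the following is a minimal sketch of one well-known ICDM, the DINA ("deterministic inputs, noisy AND") model. The Q matrix, mastery profile, and slip/guess parameter values below are invented purely for illustration; the papers in this issue consider a much broader family of models.

```python
import numpy as np

# Q matrix: 3 items x 2 skills; Q[j, k] = 1 if item j requires skill k.
# (Entries like these are typically specified by content experts.)
Q = np.array([[1, 0],
              [0, 1],
              [1, 1]])

# One examinee's mastery profile: alpha[k] = 1 if skill k is mastered.
alpha = np.array([1, 0])

# Per-item slip (s) and guess (g) parameters (assumed values).
s = np.array([0.1, 0.1, 0.2])
g = np.array([0.2, 0.2, 0.1])

# eta[j] = 1 iff the examinee has mastered every skill item j requires.
eta = np.all(alpha >= Q, axis=1).astype(int)

# DINA item response model: P(correct) = (1 - s)^eta * g^(1 - eta).
p_correct = (1 - s) ** eta * g ** (1 - eta)
print(p_correct)  # -> [0.9 0.2 0.1]
```

The examinee has mastered all skills only for item 1, so its success probability is 1 minus the slip; for items 2 and 3 performance falls to the guessing parameter. How such conjunctive assumptions compare with compensatory alternatives is exactly the kind of model-selection question raised above.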
Collectively, the papers presented in this Special Issue provide a comprehensive overview of the state of the art in IRT-based diagnosis. While all share the common end goal of examinee diagnosis, the process by which that goal is achieved and the balance of data-driven and expert-driven decision making used along the way introduce important differences.