Psychometrics: From practice to theory and back

The paper surveys 15 years of progress in three psychometric research areas: latent dimensionality structure, test fairness, and skills diagnosis of educational tests. It proposes that one effective model for selecting and carrying out research is to choose one's research questions from the practical challenges facing educational testing, then to bring sophisticated probability modeling and statistical analysis to bear on those questions, and finally to make the effectiveness of the resulting answers in meeting the educational testing challenges the ultimate criterion for judging the value of the research. The problem-solving power and the joy of working with a dedicated, focused, and collegial group of colleagues are emphasized. Finally, it is suggested that the summative assessment testing paradigm that has driven test measurement research for over half a century is giving way to a new paradigm that additionally embraces skills-level formative assessment, opening up a plethora of challenging, exciting, and societally important research problems for psychometricians.
