A Didactic Presentation of Snijders’s lz* Index of Person Fit With Emphasis on Response Model Selection and Ability Estimation

This paper focuses on two likelihood-based indices of person fit: the index lz and Snijders's modified index lz*. The first is commonly used in the practical assessment of person fit, although its asymptotic standard normal distribution does not hold when true abilities are replaced by sample ability estimates. The lz* index is a generalization of lz that corrects for this sampling variability. Surprisingly, it is not yet popular in the psychometric and educational assessment community. Moreover, there is some ambiguity about which types of item response model and ability estimation method can be used to compute lz*. The purpose of this article is to present the lz* index in a simple, didactic way. Starting from the relationship between lz and lz*, we develop the framework according to the type of logistic item response theory (IRT) model and the likelihood-based estimator of ability. The practical calculation of lz* is illustrated by analyzing a real data set on language skill assessment.
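To make the relationship between lz and lz* concrete, the following minimal sketch computes both indices for one dichotomous response pattern. It is an illustration under stated assumptions, not the article's own implementation: it assumes a two-parameter logistic (2PL) model, a maximum likelihood (ML) ability estimate (for which the correction term usually written r0 is zero), and the general form of Snijders's correction as commonly stated. The item parameters, the response pattern, and the function names `p_2pl`, `ml_theta`, and `lz_indices` are hypothetical.

```python
import math

def p_2pl(theta, a, b):
    """Probability of a correct response under the 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def ml_theta(x, a, b, start=0.0, tol=1e-10, max_iter=100):
    """ML ability estimate by Fisher scoring (needs a mixed 0/1 pattern)."""
    theta = start
    for _ in range(max_iter):
        p = [p_2pl(theta, ai, bi) for ai, bi in zip(a, b)]
        score = sum(ai * (xi - pi) for xi, ai, pi in zip(x, a, p))
        info = sum(ai ** 2 * pi * (1.0 - pi) for ai, pi in zip(a, p))
        step = max(-1.0, min(1.0, score / info))  # damp large steps
        theta += step
        if abs(step) < tol:
            break
    return theta

def lz_indices(x, a, b, theta):
    """Return (lz, lz_star), treating theta as the ML estimate (r0 = 0)."""
    p = [p_2pl(theta, ai, bi) for ai, bi in zip(a, b)]
    pq = [pi * (1.0 - pi) for pi in p]
    w = [math.log(pi / (1.0 - pi)) for pi in p]   # lz weights
    dp = [ai * v for ai, v in zip(a, pq)]         # dP/dtheta under 2PL
    r = [d / v for d, v in zip(dp, pq)]           # ML weights (equal to a_i)
    # Shared numerator: observed minus expected log-likelihood.
    num = sum((xi - pi) * wi for xi, pi, wi in zip(x, p, w))
    var_lz = sum(wi ** 2 * v for wi, v in zip(w, pq))
    # Correction: remove from w the part explained by the estimator weights.
    c = sum(d * wi for d, wi in zip(dp, w)) / sum(d * ri for d, ri in zip(dp, r))
    w_tilde = [wi - c * ri for wi, ri in zip(w, r)]
    # var_star is zero if w is exactly proportional to r (degenerate case).
    var_star = sum(wt ** 2 * v for wt, v in zip(w_tilde, pq))
    return num / math.sqrt(var_lz), num / math.sqrt(var_star)

# Hypothetical five-item test and response pattern.
a = [1.0, 1.2, 0.8, 1.5, 1.0]      # discriminations
b = [-1.5, -0.5, 0.0, 0.5, 1.5]    # difficulties
x = [0, 1, 0, 1, 1]                # observed responses

theta_hat = ml_theta(x, a, b)
lz, lz_star = lz_indices(x, a, b, theta_hat)
print(theta_hat, lz, lz_star)
```

In this ML setting the two indices share the same numerator and differ only in the variance term, so lz* is never smaller than lz in absolute value. Other estimators would add a nonzero r0 term to the numerator, which this sketch does not cover.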
