Item Purification Does Not Always Improve DIF Detection

Item purification is an iterative process that is often advocated as improving the identification of items affected by differential item functioning (DIF). With test-score-based DIF detection methods, item purification iteratively removes the currently flagged items from the test scores in order to obtain a purified set of items unaffected by DIF. The purpose of this article is to highlight that item purification is not always useful and that a single run of the DIF method may return equally suitable results. Angoff’s Delta plot, with a recent improvement to the derivation of its classification threshold, is considered as a counterexample. Several item purification processes may be defined with this method, and all of them are compared through a simulation study and a real data set analysis. It appears that none of these purification processes clearly improves the performance of the Delta plot. A tentative explanation is drawn from the conceptual difference between the modified Delta plot and the other traditional DIF methods.
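
The generic purification loop described above can be sketched as follows. This is only an illustration under stated assumptions: `toy_detect`, its `threshold`, and the per-item difference values are hypothetical stand-ins and are neither the Delta plot nor any other method discussed in the article. The loop reruns the detector on the unflagged (anchor) items until the flagged set stops changing.

```python
from statistics import mean


def toy_detect(diffs, anchor, threshold=1.0):
    """Hypothetical detector (not the Delta plot): flag every item whose
    group difference departs from the mean difference over the anchor
    items by more than `threshold`."""
    center = mean(diffs[i] for i in anchor)
    return {i for i, d in enumerate(diffs) if abs(d - center) > threshold}


def purify(diffs, detect=toy_detect, max_iter=10):
    """Iterative purification: rerun the detector using only the items
    that are currently unflagged, until the flagged set is stable."""
    items = set(range(len(diffs)))
    flagged = detect(diffs, items)        # single run, no purification
    for _ in range(max_iter):
        anchor = items - flagged          # purified set of items
        if not anchor:                    # nothing left to anchor on
            break
        new_flagged = detect(diffs, anchor)
        if new_flagged == flagged:        # converged
            break
        flagged = new_flagged
    return flagged


# Made-up per-item group differences, for illustration only.
diffs = [0.0, 0.1, -0.1, 0.0, 1.5, 3.0]
single_run = toy_detect(diffs, set(range(len(diffs))))
purified = purify(diffs)
print(sorted(single_run), sorted(purified))
```

With these made-up values the single run and the purified run flag different items, which shows only how the loop operates. Whether purification actually improves detection for a given method is the question the article examines.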
