Anchor Selection Strategies for DIF Analysis

Differential item functioning (DIF) indicates a violation of the invariance assumption, for instance in models based on item response theory (IRT). Item-wise DIF analysis using IRT requires a common metric for the item parameters of the groups to be compared (e.g., the reference and the focal group). In the Rasch model, this common metric is established by imposing the same linear restriction in both groups. The items included in the restriction are termed the "anchor items". Ideally, these items are DIF-free, so that false alarm rates are not artificially inflated. However, how to select DIF-free anchor items appropriately is still a major challenge. Furthermore, various authors point out the lack of new anchor selection strategies and the lack of a comprehensive study, especially for dichotomous IRT models. This article reviews existing anchor selection strategies that do not require any knowledge prior to DIF analysis, offers a straightforward notation, and proposes three new anchor selection strategies. An extensive simulation study is conducted to compare the performance of the anchor selection strategies. The results show that an appropriate anchor selection is crucial for suitable item-wise DIF analysis. The newly suggested anchor selection strategies outperform the existing strategies and can reliably locate a suitable anchor when the sample sizes are large enough.
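The role of the anchor restriction can be illustrated with a small numerical sketch. It is not the article's procedure: the function names `anchor_shift` and `dif_differences` and the item difficulties are hypothetical, and the restriction is assumed to be that the anchor item parameters sum to zero in each group. The sketch shows only how a contaminated anchor shifts the common metric so that DIF-free items appear to differ between groups.

```python
def anchor_shift(beta, anchor):
    """Re-express item parameters so that the anchor items sum to zero."""
    offset = sum(beta[i] for i in anchor) / len(anchor)
    return [b - offset for b in beta]


def dif_differences(beta_ref, beta_foc, anchor):
    """Item-wise parameter differences after placing both groups on the
    metric defined by the same anchor restriction."""
    ref = anchor_shift(beta_ref, anchor)
    foc = anchor_shift(beta_foc, anchor)
    return [r - f for r, f in zip(ref, foc)]


# Hypothetical item difficulties (assumed values, not taken from the article).
beta_ref = [-1.0, -0.5, 0.0, 0.5, 1.0]
# The focal group is on an arbitrarily shifted scale; only item 3 has true DIF.
beta_foc = [b + 0.3 for b in beta_ref]
beta_foc[3] += 0.6

# DIF-free anchor: only item 3 shows a non-zero difference.
clean = dif_differences(beta_ref, beta_foc, anchor=[0, 1])

# Anchor that contains the DIF item: every DIF-free item now appears shifted.
contaminated = dif_differences(beta_ref, beta_foc, anchor=[3, 4])

print([round(d, 2) for d in clean])
print([round(d, 2) for d in contaminated])
```

With the DIF-free anchor, the differences vanish for all items except the one with true DIF. With the contaminated anchor, half of the DIF effect is spread over the remaining items, which is the mechanism behind inflated false alarm rates in a subsequent item-wise test.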
