Anchor point selection: An approach for anchoring without anchor items

To detect differential item functioning (DIF) between two groups of test takers, the item parameters of the two groups must first be aligned on a common scale. Typically this is done by choosing a small number of so-called anchor items. Here we propose an alternative strategy: selecting an anchor point along the item parameter continuum at which the two groups overlap best. We illustrate how the anchor point is selected by maximizing an inequality criterion. Treated as an anchoring technique, the method performs as well as or better than established approaches, and its search path provides additional information about the DIF structure. Another distinctive property of the new method is that no individual items are flagged as anchors. This is a major difference from traditional anchoring approaches, where flagging items as anchors implies (but does not guarantee) that they are DIF-free and may lull the user into a false sense of security. Our method can be viewed as a generalization of the search space of traditional anchor selection techniques, and it can shed new light on the practical use of anchoring as well as on the theoretical discussion of anchoring and DIF in general.
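The selection step can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not state: the inequality criterion is taken to be the Gini index, the item parameters are estimated item difficulties for a reference and a focal group, and the search is a simple grid over candidate shifts. The names `gini` and `select_anchor_point` are hypothetical and are not taken from any published implementation.

```python
def gini(values):
    """Gini index of non-negative values (0 = all equal, (n-1)/n = maximally unequal)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Standard formula based on the rank-weighted sum of the ordered values.
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


def select_anchor_point(beta_ref, beta_foc, grid_size=201):
    """Grid search for the shift between groups that maximizes the Gini index.

    beta_ref, beta_foc: estimated item difficulties per group (assumed inputs).
    Returns (best_shift, best_gini). A high Gini index means that most items
    show almost no difference after shifting while a few show large differences.
    """
    diffs = [f - r for r, f in zip(beta_ref, beta_foc)]
    lo, hi = min(diffs), max(diffs)
    best_shift, best_gini = lo, -1.0
    for k in range(grid_size):
        shift = lo + (hi - lo) * k / (grid_size - 1)
        g = gini([abs(d - shift) for d in diffs])
        if g > best_gini:  # strict comparison keeps the first maximum found
            best_shift, best_gini = shift, g
    return best_shift, best_gini


# Illustrative (made-up) difficulties: four items differ by 0.3, one by 1.0.
beta_ref = [-1.0, -0.5, 0.0, 0.5, 1.0]
beta_foc = [-0.7, -0.2, 0.3, 0.8, 2.0]
shift, criterion = select_anchor_point(beta_ref, beta_foc)
```

In this toy example the selected shift aligns the four items that differ by the same amount, so only the fifth item retains a sizable difference. The sequence of criterion values over the grid is the kind of search path that could be inspected for further information about the DIF structure.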
