Iterative Purification and Effect Size Use With Logistic Regression for Differential Item Functioning Detection

Two unresolved implementation issues with logistic regression (LR) for differential item functioning (DIF) detection are ability purification and effect size use. Purification has been suggested as a way to control the detection inaccuracies that arise when DIF items are included in the ability estimate. Additionally, the use of an effect size criterion may help control Type I error rates. The effectiveness of such controls, especially when used in combination, requires evaluation. In this simulation study, detection errors were evaluated for iterative purification and no-purification procedures, each with and without an effect size criterion. Sample size, DIF magnitude, the percentage of DIF items, and ability differences were manipulated. Purification was beneficial under certain conditions, although overall power and Type I error rates did not substantially improve. The LR statistical test without purification performed as well as the other classification criteria and may be the practical choice for many situations. The need for continued evaluation of the effect size guidelines and of purification is discussed.
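
For orientation, the LR DIF test and an effect size criterion of the kind evaluated here can be sketched in their commonly used form. The notation is an assumption made for illustration and does not come from the abstract: u is the scored item response, X is the matching ability score (typically the total test score), G is a group indicator, and "full" and "reduced" denote the models with and without the two group terms. The R-squared difference is shown as one commonly used effect size; the abstract does not say which measure or cutoffs were used.

```latex
\begin{align}
% Full model: ability, group, and ability-by-group interaction
\operatorname{logit} P(u = 1 \mid X, G)
  &= \beta_0 + \beta_1 X + \beta_2 G + \beta_3 (X \cdot G) \\
% Likelihood-ratio test of H_0: \beta_2 = \beta_3 = 0 (2 degrees of freedom)
\chi^2_{\mathrm{DIF}}
  &= -2 \left[ \ln L_{\mathrm{reduced}} - \ln L_{\mathrm{full}} \right] \\
% Effect size: gain in (pseudo) R^2 from adding the two group terms
\Delta R^2
  &= R^2_{\mathrm{full}} - R^2_{\mathrm{reduced}}
\end{align}
```

In this formulation a nonzero \beta_2 corresponds to uniform DIF and a nonzero \beta_3 to nonuniform DIF. Under iterative purification, X is recomputed after dropping the items flagged in the previous pass, and the test is repeated until the set of flagged items stops changing. With an effect size criterion, an item is flagged only when the chi-square test is significant and \Delta R^2 also exceeds a chosen cutoff.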
