Item-Bundle DIF Hypothesis Testing: Identifying Suspect Bundles and Assessing Their Differential Functioning

This article proposes two methods, both based on multidimensional item response theory (IRT) models, for selecting item bundles (clusters of not necessarily adjacent items chosen according to some organizational principle) suspected of displaying amplification of differential item functioning (DIF). The approach embodied in these two methods is inspired by Shealy and Stout's (1993a, 1993b) multidimensional model for DIF. Each bundle selected by these methods constitutes a DIF amplification hypothesis. When SIBTEST (Shealy & Stout, 1993b) confirms DIF amplification in a selected bundle, differential bundle functioning (DBF) is said to occur. Three real-data examples illustrate the two methods for suspect bundle selection, and the effectiveness of the methods is argued on statistical grounds. A distinction is drawn between benign and adverse DIF. Deciding whether flagged DIF items or DBF bundles display benign or adverse DIF/DBF must depend in part on nonstatistical construct validity arguments. Conducting DBF analyses using these methods should help identify the causes of DIF/DBF.
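To make the idea of testing a suspect bundle concrete, the following is a minimal sketch of a SIBTEST-style bundle effect size. It is not the authors' implementation and deliberately omits SIBTEST's regression correction and its standard error, so it shows only the core matching idea: examinees are stratified on their score on the remaining (matching) items, and reference and focal groups are compared on their bundle score within each stratum. The function name `bundle_beta_uncorrected`, the 0/1 group coding, and the use of combined-group stratum sizes as weights are all assumptions made for illustration.

```python
import numpy as np

def bundle_beta_uncorrected(responses, group, bundle_items):
    """Weighted mean difference in bundle score between matched groups.

    responses    : (n_examinees, n_items) array of 0/1 item scores
    group        : length-n array, 0 = reference group, 1 = focal group (assumed coding)
    bundle_items : column indices of the suspect bundle

    Hypothetical helper: no regression correction, no significance test.
    """
    responses = np.asarray(responses)
    group = np.asarray(group)

    # Split the items into the suspect bundle and the matching subtest.
    in_bundle = np.zeros(responses.shape[1], dtype=bool)
    in_bundle[list(bundle_items)] = True
    bundle_score = responses[:, in_bundle].sum(axis=1)
    match_score = responses[:, ~in_bundle].sum(axis=1)

    weighted_sum = 0.0
    total_weight = 0
    for k in np.unique(match_score):
        ref = bundle_score[(match_score == k) & (group == 0)]
        foc = bundle_score[(match_score == k) & (group == 1)]
        if len(ref) == 0 or len(foc) == 0:
            continue  # a stratum needs both groups to be comparable
        n_k = len(ref) + len(foc)  # assumed weight: combined stratum size
        weighted_sum += n_k * (ref.mean() - foc.mean())
        total_weight += n_k

    if total_weight == 0:
        return float("nan")
    return weighted_sum / total_weight

# Toy data: items 0 and 1 form the matching subtest, item 2 is the bundle.
toy_responses = np.array([[1, 1, 1], [1, 0, 1], [1, 1, 0], [0, 1, 0]])
toy_group = np.array([0, 0, 1, 1])
toy_beta = bundle_beta_uncorrected(toy_responses, toy_group, [2])
print(toy_beta)
```

Under this sketch, a positive value indicates that matched reference-group examinees outscore focal-group examinees on the bundle. A real analysis would rely on SIBTEST itself for the corrected estimate and the hypothesis test.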

[1] Brian Habing, et al. Conditional Covariance-Based Nonparametric Multidimensionality Assessment, 1996.

[2] W. Stout, et al. An Item Response Theory Model for Test Bias, 1991.

[3] H. Wainer, et al. Differential Testlet Functioning: Definitions and Detection, 1991.

[4] Anil K. Jain, et al. Algorithms for Clustering Data, 1988.

[5] Louis L. McQuitty, et al. Hierarchical Linkage Analysis for the Isolation of Types, 1960.

[6] N. Dorans. Two New Approaches to Assessing Differential Item Functioning: Standardization and the Mantel-Haenszel Method, 1989.

[7] William Stout, et al. A Nonparametric Approach for Assessing Latent Trait Unidimensionality, 1987.

[8] Wendy M. Yen, et al. Effects of Local Item Dependence on the Fit and Equating Performance of the Three-Parameter Logistic Model, 1984.

[9] Ratna Nandakumar, et al. Refinements of Stout's Procedure for Assessing Latent Trait Unidimensionality, 1993.

[10] Neil J. Dorans, et al. The Standardization Approach to Assessing Differential Speededness, 1988.

[11] De Champlain, et al. Assessing the Effect of Multidimensionality on IRT True-Score Equating for Subgroups of Examinees, 1995.

[12] Ratna Nandakumar, et al. Simultaneous DIF Amplification and Cancellation: Shealy-Stout's Test for DIF, 1993.

[13] Patricia A. Scott, et al. Content Effects on Word Problem Performance: A Possible Source of Test Bias?, 1991.

[14] Kathleen A. O'Neill, et al. Item and Test Characteristics That Are Associated with Differential Item Functioning, 1993.

[15] H. Wainer, et al. Differential Item Functioning, 1994.

[16] P. Holland, et al. Evaluating Hypotheses About Differential Item Functioning, 1992.

[17] H. Swaminathan, et al. An Assessment of Stout's Index of Essential Unidimensionality, 1996.

[18] Timothy R. Miller, et al. Cluster Analysis of Angular Data in Applications of Multidimensional Item-Response Theory, 1992.

[19] William Stout, et al. A Model-Based Standardization Approach That Separates True Bias/DIF from Group Ability Differences and Detects Test Bias/DTF as Well as Item Bias/DIF, 1993.