The analysis of measurement equivalence in international studies using the Rasch model

When comparing data derived from tests or questionnaires in cross-national studies, researchers commonly assume measurement invariance in their underlying scaling models. However, different cultural contexts, languages, and curricula can have powerful effects on how students respond in different countries. This article illustrates how the Rasch item response theory (IRT) model (Rasch, 1960) can be used to assess differences in the measurement properties of tests and questionnaires, with reference to examples from the field trial analyses for the International Association for the Evaluation of Educational Achievement (IEA) International Civic and Citizenship Education Study (ICCS). It also discusses the general scope and limitations of the analyses undertaken in the context of this study.
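The dichotomous Rasch model at the heart of these analyses expresses the probability of a correct response as a logistic function of the difference between person ability and item difficulty. The sketch below illustrates the idea, including a crude invariance check in the spirit of the article: comparing an item's difficulty calibrated separately in two country samples. The difficulty values are purely hypothetical illustrations, not results from ICCS.

```python
import math

def rasch_prob(theta, b):
    """Dichotomous Rasch model:
    P(X = 1 | theta, b) = exp(theta - b) / (1 + exp(theta - b)),
    where theta is person ability and b is item difficulty (both in logits)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# Illustrative invariance check: if the model held across countries, an
# item's difficulty estimated within each country sample should agree
# (up to sampling error). A large shift flags differential item functioning.
b_country_a = 0.40   # hypothetical calibrated difficulty in country A
b_country_b = -0.15  # hypothetical calibrated difficulty in country B
shift = b_country_a - b_country_b
print(f"Item difficulty shift between countries: {shift:+.2f} logits")
```

When ability equals difficulty (`theta == b`), the model gives a response probability of exactly 0.5, which is what makes difficulty shifts across national calibrations directly interpretable on the logit scale.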

[1] Jürgen Rost, et al. A logistic mixture distribution model for polychotomous item responses, 1991.

[2] Norbert K. Tanzer, et al. Bias and equivalence, 2000.

[3] D. R. Lehman, et al. What's wrong with cross-cultural comparisons of subjective Likert scales? The reference-group effect. Journal of Personality and Social Psychology, 2002.

[4] J. Fraillon, et al. ICCS 2009 Technical Report, 2011.

[5] J. Rost, et al. Applications of Latent Trait and Latent Class Models in the Social Sciences, 1998.

[6] J. Fraillon, et al. ICCS 2009 International Report: Civic knowledge, attitudes and engagement among lower secondary school students in thirty-eight countries, 2010.

[7] Georg Rasch. Probabilistic Models for Some Intelligence and Attainment Tests, 1960.

[8] Wolfram Schulz. Questionnaire Construct Validation in the International Civic and Citizenship Education Study, 2008.

[9] R. Hambleton, et al. Fundamentals of Item Response Theory, 1991.

[10] Barbara M. Byrne, et al. Measurement equivalence: a comparison of methods based on confirmatory factor analysis and item response theory. The Journal of Applied Psychology, 2002.

[11] R. Hambleton, et al. Handbook of Modern Item Response Theory, 1997.

[12] A. Grisay, et al. Equivalence of item difficulties across national versions of the PIRLS and PISA reading assessments, 2009.

[13] Jürgen Rost, et al. A Conditional Item-Fit Index for Rasch Models, 1994.

[14] Wolfram Schulz, et al. International Civic and Citizenship Education Study: Assessment Framework, 2008.

[15] Item Analysis and Review, 2001.

[16] G. Masters, et al. Rating Scale Analysis: Rasch Measurement, 1983.

[17] Wolfram Schulz, et al. Citizenship and Education in Twenty-Eight Countries: Civic Knowledge at Age Fourteen, 2001.

[18] Suzanne Jak, et al. Measurement bias and multidimensionality: an illustration of bias detection in multidimensional measurement models, 2010.

[19] Jean-Paul Fox, et al. Multilevel IRT model assessment, 2005.

[20] Wolfram Schulz, et al. Initial Findings from the IEA International Civic and Citizenship Education Study, 2010.

[21] Wolfram Schulz, et al. IEA Civic Education Study: Technical Report, 2004.

[22] M. Walker. Ameliorating culturally based extreme response tendencies to attitude items. Journal of Applied Measurement, 2007.

[23] Klaas Sijtsma, et al. New Developments in Categorical Data Analysis for the Social and Behavioral Sciences, 2005.

[24] R. Hambleton, et al. Item Bias Review, 1994.

[25] IEA Civic Education Study, 2002.