Using plausible values in secondary analysis in large-scale assessments

ABSTRACT Plausible values are typically used in large-scale assessment studies, in particular in the Trends in International Mathematics and Science Study (TIMSS) and the Programme for International Student Assessment (PISA). Despite their widespread use, questions remain about how plausible values should be used and how that use affects statistical analyses. The aim of this paper is to demonstrate the role of plausible values in large-scale assessment surveys when multilevel modeling is used. Different user strategies for handling plausible values are examined for multilevel models as well as for means and variances. The results show that some commonly used strategies give incorrect results, while others give reasonable estimates but incorrect standard errors. These findings are important for anyone conducting secondary analyses of large-scale assessment data, especially those who use multilevel models.
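
To make concrete what a "user strategy" for plausible values can look like, the sketch below pools results obtained by running the same analysis once per plausible value, using Rubin's combining rules for multiply imputed data. This is an illustration only: the abstract does not say which strategies the paper compares or which it recommends, the function name `pool_plausible_values` is hypothetical, and the numbers are made up rather than taken from TIMSS or PISA.

```python
from statistics import mean, variance


def pool_plausible_values(estimates, sampling_variances):
    """Pool per-plausible-value results with Rubin's combining rules.

    estimates: one point estimate per plausible value (e.g. a mean or a
        regression coefficient from the same model fitted M times).
    sampling_variances: the squared standard error of each estimate.
    Returns the pooled estimate and its pooled standard error.
    """
    m = len(estimates)
    if m < 2 or m != len(sampling_variances):
        raise ValueError("need matching lists with at least two plausible values")

    pooled_estimate = mean(estimates)
    within = mean(sampling_variances)   # average sampling variance
    between = variance(estimates)       # variance across plausible values (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled_estimate, total ** 0.5


# Hypothetical results from five plausible values (illustrative numbers only).
estimates = [502.1, 498.7, 500.4, 501.3, 499.5]
sampling_variances = [4.0, 4.2, 3.9, 4.1, 4.0]

pooled, se = pool_plausible_values(estimates, sampling_variances)
print(f"pooled estimate = {pooled:.2f}, pooled SE = {se:.2f}")
```

In this sketch the pooled standard error is larger than the square root of the average sampling variance, because the between-plausible-value term adds the measurement uncertainty. A strategy that averages the plausible values first and then analyzes that single score never computes this term, which is one way an analysis could produce a reasonable estimate with an incorrect standard error.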
