A Review of Missing Data Handling Methods in Education Research

Missing data are common in survey-based education research, and the way missing values are handled can significantly affect the results of analyses based on such data. Despite known problems with the performance of some missing data handling methods, such as mean imputation, many education researchers continue to use them as a quick fix. This study reviews the current literature on missing data handling methods in the specific context of education research, summarizes the pros and cons of the various methods, and provides guidelines for future research in this area.
