EPEM: Efficient Parameter Estimation for Multiple Class Monotone Missing Data

The problem of monotone missing data has been broadly studied during the last two decades and has many applications in fields such as bioinformatics and statistics. Commonly used imputation techniques require multiple iterations through the data before they converge. Moreover, those approaches may introduce extra noise and bias into the subsequent modeling. In this work, we derive exact formulas and propose a novel algorithm, namely EPEM, to compute the maximum likelihood estimators (MLEs) of a multiple class, monotone missing dataset when the covariance matrices of all classes are assumed to be equal. We then illustrate an application of our proposed method in Linear Discriminant Analysis (LDA). As the computation is exact, our EPEM algorithm does not require multiple iterations through the data as other imputation approaches do, and thus promises to be much less time-consuming than other methods. This effectiveness was validated by empirical results, in which EPEM reduced the error rates significantly and required a short computation time compared to several imputation-based approaches. We also release all code and data of our experiments in a GitHub repository to contribute to the research community working on this problem.
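To make the idea of a closed-form, non-iterative MLE concrete, the sketch below implements the classical single-class, two-block case of monotone missing data (in the spirit of the factored-likelihood approach for multivariate normal data). It is not the EPEM algorithm itself, which handles multiple classes with a shared covariance matrix. The function name `monotone_mle_two_block` and the two-block layout (the first block observed for all samples, the second only for the first m samples) are assumptions made for illustration.

```python
import numpy as np

def monotone_mle_two_block(x1_all, x2_obs):
    """Closed-form MLE of (mu, Sigma) for a two-block monotone pattern.

    x1_all: (n, p1) array, first block, observed for every sample.
    x2_obs: (m, p2) array, second block, observed only for the first m samples.
    Illustrative single-class sketch; assumes multivariate normal data.
    """
    n, m = x1_all.shape[0], x2_obs.shape[0]
    x1_cc = x1_all[:m]  # first block restricted to the complete cases

    # Block 1: ordinary MLEs from all n samples (divisor n, not n - 1).
    mu1 = x1_all.mean(axis=0)
    d1 = x1_all - mu1
    sigma11 = d1.T @ d1 / n

    # Regression of block 2 on block 1, using the m complete cases only.
    m1, m2 = x1_cc.mean(axis=0), x2_obs.mean(axis=0)
    c1, c2 = x1_cc - m1, x2_obs - m2
    s11 = c1.T @ c1 / m
    s21 = c2.T @ c1 / m
    s22 = c2.T @ c2 / m
    coef = np.linalg.solve(s11, s21.T).T  # equals s21 @ inv(s11), s11 symmetric
    resid_cov = s22 - coef @ s21.T        # residual covariance of block 2

    # Combine the two factors into the joint parameters.
    mu2 = m2 + coef @ (mu1 - m1)
    sigma21 = coef @ sigma11
    sigma22 = resid_cov + coef @ sigma11 @ coef.T

    mu = np.concatenate([mu1, mu2])
    sigma = np.block([[sigma11, sigma21.T], [sigma21, sigma22]])
    return mu, sigma
```

The estimates are obtained in a single pass of matrix operations, with no imputation loop. When no values are missing (m equals n), the result coincides with the usual sample mean and the divisor-n sample covariance.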
