Leveraging variational autoencoders for multiple data imputation

Missing data persists as a major barrier to data analysis across numerous applications. Recently, deep generative models have been used for imputation of missing data, motivated by their ability to capture highly non-linear and complex relationships in the data. In this work, we investigate the ability of deep models, namely variational autoencoders (VAEs), to account for uncertainty in missing data through multiple imputation strategies. We find that VAEs provide poor empirical coverage of missing data, with underestimation and overconfident imputations, particularly for more extreme missing data values. To overcome this, we employ β-VAEs, which, viewed from a generalized Bayes framework, provide robustness to model misspecification. Assigning a good value of β is critical for uncertainty calibration, and we demonstrate how this can be achieved using cross-validation. In downstream tasks, we show how multiple imputation with β-VAEs can avoid false discoveries that arise as artefacts of imputation.
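The notion of empirical coverage used above can be illustrated with a minimal sketch. It is not the paper's code: the function name `empirical_coverage`, the simulated Gaussian draws standing in for VAE imputations, and the 95% level are all assumptions chosen for illustration. The sketch computes, for held-out entries, how often the true value falls inside the central interval formed by its multiple imputations.

```python
import numpy as np

def empirical_coverage(draws, truth, level=0.95):
    """Fraction of held-out values inside the central `level` interval
    of their imputation draws.

    draws: array of shape (M, n), M imputations for each of n missing entries.
    truth: array of shape (n,), the held-out true values.
    """
    alpha = 1.0 - level
    lower = np.quantile(draws, alpha / 2, axis=0)
    upper = np.quantile(draws, 1 - alpha / 2, axis=0)
    return float(np.mean((truth >= lower) & (truth <= upper)))

# Toy data (hypothetical): true values are standard normal.
rng = np.random.default_rng(0)
n, M = 2000, 200
truth = rng.normal(size=n)

# Stand-ins for imputation draws, not outputs of any real model:
# one set with the right spread, one that is too narrow.
calibrated = rng.normal(0.0, 1.0, size=(M, n))
overconfident = rng.normal(0.0, 0.3, size=(M, n))

cov_cal = empirical_coverage(calibrated, truth)
cov_over = empirical_coverage(overconfident, truth)
print(f"calibrated: {cov_cal:.2f}, overconfident: {cov_over:.2f}")
```

In this toy setting, the too-narrow draws cover far fewer of the true values than the nominal level, which is the kind of overconfidence the abstract describes. Under the same assumptions, a held-out check like this could be repeated across candidate values of β to compare their calibration.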
