DAEMA: Denoising Autoencoder with Mask Attention

Missing data is a recurrent and challenging problem, especially when using machine learning algorithms for real-world applications. For this reason, missing data imputation has become an active research area, in which recent deep learning approaches have achieved state-of-the-art results. We propose DAEMA (Denoising Autoencoder with Mask Attention), an algorithm based on a denoising autoencoder architecture with an attention mechanism. While most imputation algorithms use incomplete inputs as they would use complete data, up to basic preprocessing (e.g. mean imputation), DAEMA leverages a mask-based attention mechanism to focus on the observed values of its inputs. We evaluate DAEMA both in terms of reconstruction capabilities and downstream prediction, and show that it achieves superior performance to state-of-the-art algorithms on several publicly available real-world datasets under various missingness settings.
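To make the idea of mask-based attention concrete, the following is a minimal PyTorch sketch of a denoising autoencoder whose latent representation is reweighted by attention weights computed from the missingness mask, so the observed entries drive the reconstruction. This is only an illustration of the general mechanism described in the abstract, not the authors' implementation: the class name `MaskAttentionDAE`, the layer sizes, and the loss are illustrative assumptions.

```python
import torch
import torch.nn as nn


class MaskAttentionDAE(nn.Module):
    """Illustrative denoising autoencoder with a mask-based attention step.

    The mask m (1 = observed, 0 = missing) is fed to a small network that
    produces latent-wise attention weights; these reweight the encoded
    input before decoding, letting the model focus on observed values.
    (Sketch only; architecture details are assumptions, not the paper's.)
    """

    def __init__(self, n_features: int, latent_dim: int = 32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_features, latent_dim), nn.ReLU())
        # Attention weights derived from the missingness mask.
        self.mask_attention = nn.Sequential(
            nn.Linear(n_features, latent_dim), nn.Softmax(dim=-1)
        )
        self.decoder = nn.Linear(latent_dim, n_features)

    def forward(self, x: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        x_in = x * mask                      # zero out missing entries (basic preprocessing)
        z = self.encoder(x_in)               # encode the incomplete input
        a = self.mask_attention(mask)        # attention weights from the mask
        return self.decoder(z * a)           # reconstruction used as imputation


if __name__ == "__main__":
    torch.manual_seed(0)
    x = torch.randn(8, 10)                       # toy batch with 10 features
    mask = (torch.rand(8, 10) > 0.2).float()     # 1 = observed, 0 = missing
    model = MaskAttentionDAE(n_features=10)
    x_hat = model(x, mask)
    # Denoising-style objective: only observed entries contribute to the loss.
    loss = ((x_hat - x) ** 2 * mask).sum() / mask.sum()
    loss.backward()
    print(x_hat.shape, float(loss))
```

In practice one would also inject additional artificial missingness during training so the model learns to reconstruct values it cannot see; the snippet above omits that step for brevity.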
