Paper Information - Variational Selective Autoencoder: Learning from Partially-Observed Heterogeneous Data


Learning from heterogeneous data poses challenges such as combining data from various sources and of different types. Meanwhile, heterogeneous data in real-world applications are often partially missing because the input sources are heterogeneous and noisy. In this work, we propose the variational selective autoencoder (VSAE), a general framework for learning representations from partially-observed heterogeneous data. VSAE learns the latent dependencies in heterogeneous data by modeling the joint distribution of observed data, unobserved data, and the imputation mask, which represents how the data are missing. The result is a unified model for various downstream tasks, including data generation and imputation. Evaluation on both low-dimensional and high-dimensional heterogeneous datasets for these two tasks shows improvement over state-of-the-art models.
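To make the three quantities in the joint distribution concrete, the following is a minimal illustrative sketch of a single heterogeneous record with an attribute-level mask. It is not the VSAE model itself. The attribute names, the toy values, the helper `split_by_mask`, and the convention that a mask value of 1 means "observed" are all assumptions made for illustration.

```python
import numpy as np


def split_by_mask(attributes, mask):
    """Split one record into observed and unobserved attributes.

    Assumed convention (not taken from the paper): mask[name] == 1 means
    the attribute is observed, 0 means it is missing.
    """
    observed = {k: v for k, v in attributes.items() if mask[k] == 1}
    unobserved = {k: v for k, v in attributes.items() if mask[k] == 0}
    return observed, unobserved


# A hypothetical heterogeneous record: numeric, categorical, and image-like parts.
record = {
    "age": np.array([37.0]),             # numeric attribute
    "color": np.array([0.0, 1.0, 0.0]),  # one-hot categorical attribute
    "image": np.zeros((4, 4)),           # toy image attribute
}

# Hypothetical mask: "color" is missing for this record.
mask = {"age": 1, "color": 0, "image": 1}

observed, unobserved = split_by_mask(record, mask)
print(sorted(observed))    # names of the observed attributes
print(sorted(unobserved))  # names of the missing attributes
```

In this picture, a model of the kind the abstract describes would treat `observed`, `unobserved`, and `mask` as jointly distributed, rather than treating the missing entries as noise to be discarded.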
