Multivariate Time Series Imputation with Generative Adversarial Networks
Abstract: Multivariate time series often contain many missing values, which hinders the application of advanced analysis methods to such data. Conventional approaches to missing values, including mean/zero imputation, case deletion, and matrix factorization-based imputation, cannot model the temporal dependencies or the complex distributions of multivariate time series. In this paper, we treat missing value imputation as a data generation problem. Inspired by the success of Generative Adversarial Networks (GANs) in image generation, we propose to learn the overall distribution of a multivariate time series dataset with a GAN, which is then used to generate the missing values for each sample. Unlike image data, time series data are usually incomplete because of the nature of the data recording process. A modified Gated Recurrent Unit (GRU) is employed in the GAN to model the temporal irregularity of the incomplete time series. Experiments on two multivariate time series datasets show that the proposed model outperforms the baselines in imputation accuracy. The results also show that a simple model trained on the imputed data achieves state-of-the-art results on the prediction tasks, demonstrating the benefits of our model in downstream applications.
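To make the notion of temporal irregularity concrete, the following is a minimal Python sketch, not the model described in the abstract. It assumes a single feature with a binary observation mask, computes the time elapsed since the last observation at each step, and maps that lag to a decay factor that a recurrent cell could use to down-weight stale information. The function names (`time_lags`, `decay`, `mean_impute`), the exponential decay form, and the parameters `w` and `b` are illustrative assumptions; the abstract does not specify how the GRU is modified. Mean imputation is included only as the conventional baseline the abstract mentions.

```python
import math


def time_lags(timestamps, mask):
    """Time elapsed at each step since the feature was last observed.

    mask[t] == 1 means the value at step t was recorded, 0 means missing.
    """
    lags = [0.0] * len(timestamps)
    for t in range(1, len(timestamps)):
        gap = timestamps[t] - timestamps[t - 1]
        # If the previous step was observed, the lag restarts from it;
        # otherwise the lag keeps accumulating across the missing run.
        lags[t] = gap if mask[t - 1] == 1 else gap + lags[t - 1]
    return lags


def decay(lag, w=0.5, b=0.0):
    """Hypothetical decay factor in (0, 1]: exp(-max(0, w * lag + b))."""
    return math.exp(-max(0.0, w * lag + b))


def mean_impute(values, mask):
    """Baseline: fill every missing entry with the mean of observed entries."""
    observed = [v for v, m in zip(values, mask) if m == 1]
    mean = sum(observed) / len(observed)
    return [v if m == 1 else mean for v, m in zip(values, mask)]


# Irregularly sampled toy series (assumed data, for illustration only).
timestamps = [0.0, 1.0, 3.0, 4.0, 8.0]
values = [2.0, None, None, 5.0, None]
mask = [1, 0, 0, 1, 0]

lags = time_lags(timestamps, mask)
factors = [decay(lag) for lag in lags]
baseline = mean_impute(values, mask)
```

The baseline fills every gap with the same constant regardless of when the gap occurs, whereas the lag-dependent factor shrinks as the last observation grows older. This is the kind of temporal information that mean or zero imputation discards.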