Multi-indicator Water Time Series Imputation with Autoregressive Generative Adversarial Networks

Water quality data often contain missing values because water environment monitoring equipment is easily damaged by environmental influences, and this loss of integrity reduces the accuracy of downstream analysis tasks. Traditional data imputation methods, such as mean/last filling, K-nearest neighbor, matrix factorization, and Lagrangian interpolation, either do not consider time dependence or fail to use the complex relations among multiple features. Inspired by the successful application of Generative Adversarial Network (GAN) variants to time series data, this work proposes a time series data imputation method called GEDA, which integrates a GAN, an Encoder-Decoder structure, and an Autoregressive network. GEDA uses the GAN to learn the probability distribution of multi-feature time series and imputes the missing values with the generated data. It then combines the feature extraction of the encoder-decoder structure with the time-dependence capture of the autoregressive network. Experimental results on a real-life dataset demonstrate that GEDA outperforms several state-of-the-art data imputation methods in terms of accuracy.
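To make the contrast between the traditional baselines and generation-based imputation concrete, the following is a minimal sketch in plain Python. It is not GEDA's implementation: the function names, the sample readings, and the `generated` values are all hypothetical, and `generated` is a hand-written stand-in for the output of a trained generator. The sketch shows mean filling, last-value filling, and the final step of keeping observed entries while taking generated values only at missing positions.

```python
import math

def mean_fill(series):
    """Replace NaNs in each feature column with that column's observed mean."""
    n_features = len(series[0])
    means = []
    for j in range(n_features):
        observed = [row[j] for row in series if not math.isnan(row[j])]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if math.isnan(v) else v for j, v in enumerate(row)]
            for row in series]

def last_fill(series):
    """Carry the last observed value forward per feature; leading gaps stay NaN."""
    last = [math.nan] * len(series[0])
    filled = []
    for row in series:
        last = [prev if math.isnan(v) else v for prev, v in zip(last, row)]
        filled.append(list(last))
    return filled

def merge_with_generated(series, generated):
    """Keep observed entries; use generated values only where data is missing."""
    return [[g if math.isnan(v) else v for v, g in zip(row, gen_row)]
            for row, gen_row in zip(series, generated)]

nan = math.nan
# Hypothetical readings: rows are time steps, columns are two indicators
# (for example dissolved oxygen and pH). NaN marks a missing measurement.
readings = [[8.0, 7.0], [nan, 7.2], [6.0, nan], [nan, 7.4]]

mean_filled = mean_fill(readings)
last_filled = last_fill(readings)

# Assumption: stand-in values for what a trained generator might produce.
generated = [[7.9, 7.1], [7.1, 7.2], [6.2, 7.3], [6.4, 7.5]]
merged = merge_with_generated(readings, generated)
```

Mean filling ignores time order entirely and last-value filling treats each indicator independently, which are the two limitations the abstract attributes to traditional methods. The merge step only determines where generated values are used; the quality of the imputation depends on the generator itself.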
