Attention-Based Sequence-to-Sequence Model for Time Series Imputation

Time series data often contain missing values and are typically high-dimensional and large in volume. To address missing values in high-dimensional time series, this paper proposes an attention-based sequence-to-sequence model (ASSM) for imputing missing values, which combines feature learning with data computation. The model consists of two parts, an encoder and a decoder. The encoder is a bidirectional GRU (BiGRU) recurrent neural network with a self-attention mechanism, which makes the model better able to handle long-range time series; the decoder is a GRU recurrent neural network with a cross-attention mechanism that links it to the encoder. Relationship weights between the sequences generated by the decoder and the known sequences in the encoder are calculated so that the model focuses on the sequences that are most highly correlated. We conduct comparison experiments against six models on four real datasets, using four evaluation metrics. The experimental results show that the proposed model outperforms all six baseline missing-value imputation algorithms.
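The relationship weights between a decoder step and the encoder sequence can be illustrated with a minimal sketch. It assumes scaled dot-product scoring, which the abstract does not specify, and the names `cross_attention`, `decoder_state`, and `encoder_states` are hypothetical. The hidden states are toy values, not outputs of the actual BiGRU or GRU.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(decoder_state, encoder_states):
    """Return (weights, context) for one decoder step.

    Assumption: scaled dot-product scoring; the paper may use a
    different scoring function.
    """
    dim = len(decoder_state)
    # One score per encoder time step: similarity to the decoder state.
    scores = [
        sum(q * k for q, k in zip(decoder_state, h)) / math.sqrt(dim)
        for h in encoder_states
    ]
    weights = softmax(scores)
    # Context vector: weighted sum of the encoder hidden states.
    context = [
        sum(w * h[j] for w, h in zip(weights, encoder_states))
        for j in range(dim)
    ]
    return weights, context


# Toy hidden states (hypothetical values for illustration only).
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [2.0, 0.0]
weights, context = cross_attention(decoder_state, encoder_states)
print(weights, context)
```

In this sketch the encoder steps most similar to the current decoder state receive the largest weights, which is the sense in which the model focuses on highly correlated sequences.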
