seq2graph: Discovering Dynamic Non-linear Dependencies from Multivariate Time Series

Discovering time-lagged dependencies and inter-dependencies in multivariate time series data is an important task. However, in many real-world applications with big data, such as commercial cloud management or predictive maintenance in manufacturing, these dependencies can be time-variant and non-linear, which makes them difficult to extract with traditional methods such as Granger causality or statistical models. In this work, we present a novel deep learning model that uses multiple layers of adapted gated recurrent units (GRUs) to discover both time-lagged behaviors and inter-timeseries dependencies, representing them as directed weighted graphs. Each individual time series is first analyzed by a pair of encoding-decoding GRUs in order to discover its time-lagged dependencies and to represent its samples as high-dimensional vectors. The vectors collected from all component time series are then analyzed by a decoding network component that discovers inter-dependencies across all time series while forecasting their next values. Although the discovery of the two types of dependencies is separated into two levels of our neural network, the levels are tightly connected and jointly trained in an end-to-end manner. With this joint training, an improvement in learning one type of dependency immediately impacts the learning of the other, leading to highly accurate overall dependency discovery. We empirically test our model on synthetic time series data in which the exact form of the dependencies is known. We also evaluate its performance on two real-world applications: (i) dynamic multivariate performance monitoring data with high volatility from a commercial cloud provider, and (ii) multivariate time series generated by sensors for a manufacturing plant. We show how our approach captures these dependency behaviors via intuitive and interpretable dependency graphs and uses them to generate forecasts.
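
The idea of turning per-series vectors into a directed weighted graph can be illustrated with a minimal sketch. The abstract does not state how edge weights are computed, so the scoring rule below (softmax-normalized dot products, an attention-style choice) is an assumption, and the names `dependency_graph` and `vectors` and the example values are hypothetical. In the model described above, the vectors would come from the per-series GRUs rather than being written by hand.

```python
import math


def softmax(scores):
    """Normalize raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dependency_graph(vectors):
    """Return graph[target][source]: weight of the edge source -> target.

    Assumption: the weight is a softmax over dot-product scores between
    the target's vector and every series' vector (including itself).
    """
    names = list(vectors)
    graph = {}
    for target in names:
        scores = [
            sum(a * b for a, b in zip(vectors[target], vectors[source]))
            for source in names
        ]
        graph[target] = dict(zip(names, softmax(scores)))
    return graph


# Hypothetical per-series encodings, standing in for GRU outputs.
vectors = {
    "cpu": [1.0, 0.0],
    "memory": [0.9, 0.1],
    "disk": [0.0, 1.0],
}
graph = dependency_graph(vectors)

for target, edges in graph.items():
    print(target, {source: round(w, 3) for source, w in edges.items()})
```

Because each row is normalized separately, the weight of the edge from `memory` to `cpu` generally differs from the weight of the edge from `cpu` to `memory`, which is what makes the resulting graph directed.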
