Deep Gated Recurrent and Convolutional Network Hybrid Model for Univariate Time Series Classification

Hybrid LSTM-fully convolutional networks (LSTM-FCN) have produced state-of-the-art results for univariate time series classification. We show empirically that replacing the LSTM with a gated recurrent unit (GRU), yielding a GRU-fully convolutional network hybrid model (GRU-FCN), can offer even better performance on many time series datasets without any further changes to the model. Our empirical study shows that the proposed GRU-FCN model also outperforms state-of-the-art classifiers on many univariate time series datasets without requiring additional supporting algorithms. Furthermore, because the GRU has a simpler architecture than the LSTM, it has fewer trainable parameters, shorter training time, smaller memory requirements, and a simpler hardware implementation than LSTM-based models.
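The parameter-count claim can be illustrated with a short calculation. The sketch below is not taken from the paper: it assumes the textbook formulations, in which an LSTM layer has four weight blocks (input, forget, and output gates plus the candidate cell state) and a GRU layer has three (update and reset gates plus the candidate state), each block holding an input matrix, a recurrent matrix, and one bias vector. The function names and the example sizes are hypothetical, and some library implementations add extra bias terms, so exact counts may differ slightly in practice.

```python
def recurrent_params(input_dim: int, hidden: int, blocks: int) -> int:
    """Trainable parameters of one gated recurrent layer (textbook form).

    Each block has an input matrix (hidden x input_dim), a recurrent
    matrix (hidden x hidden), and a single bias vector (hidden).
    """
    per_block = hidden * (input_dim + hidden) + hidden
    return blocks * per_block


def lstm_params(input_dim: int, hidden: int) -> int:
    # Assumption: 3 gates + 1 candidate cell state = 4 blocks.
    return recurrent_params(input_dim, hidden, blocks=4)


def gru_params(input_dim: int, hidden: int) -> int:
    # Assumption: 2 gates + 1 candidate state = 3 blocks.
    return recurrent_params(input_dim, hidden, blocks=3)


if __name__ == "__main__":
    # Illustrative sizes only: a univariate input and 8 hidden units.
    d, h = 1, 8
    print("LSTM parameters:", lstm_params(d, h))
    print("GRU parameters: ", gru_params(d, h))
    print("GRU / LSTM ratio:", gru_params(d, h) / lstm_params(d, h))
```

Under these assumptions the recurrent part of a GRU-based model carries three quarters of the parameters of its LSTM counterpart for any choice of input and hidden size, which is consistent with the smaller memory footprint described above.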
