Fusion Recurrent Neural Network

For practical deep sequence learning, two representative RNNs, LSTM and GRU, usually come to mind first. But is there no room for other RNNs? Could a better RNN emerge in the future? In this work, we propose a novel, succinct and promising RNN: the Fusion Recurrent Neural Network (Fusion RNN). At every time step, Fusion RNN consists of a Fusion module and a Transport module. The Fusion module performs multi-round fusion of the input and the hidden state vector. The Transport module, which is essentially a simple recurrent network, computes the hidden state and passes it to the next time step. To evaluate Fusion RNN's sequence feature extraction capability, we choose a representative data mining task for sequence data, estimated time of arrival (ETA), and present a novel ETA model based on Fusion RNN. We compare our method with other RNN variants for ETA on massive vehicle travel data from DiDi Chuxing. The results demonstrate that, for ETA, Fusion RNN is comparable to the state-of-the-art LSTM and GRU, both of which are more complicated than Fusion RNN.
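The two-module structure described above can be illustrated with a minimal sketch. The abstract does not give the fusion equations, so the following is an assumption rather than the authors' formulation: the Fusion module is modeled as alternating sigmoid gating between the input and the hidden state, and the Transport module as a plain tanh recurrent update. All names (`fusion`, `transport`, `fusion_rnn`, `init_params`, and the weight keys `f_x`, `f_h`, `w`, `u`, `b`) are hypothetical.

```python
import math
import random


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fusion(x, h, f_x, f_h, rounds):
    """Assumed Fusion module: multi-round mutual gating of x and h.

    Odd rounds rescale x with a gate computed from h; even rounds
    rescale h with a gate computed from x. This gating form is an
    illustrative assumption, not taken from the abstract.
    """
    for r in range(1, rounds + 1):
        if r % 2 == 1:
            gate = [sigmoid(g) for g in matvec(f_x, h)]
            x = [g * xi for g, xi in zip(gate, x)]
        else:
            gate = [sigmoid(g) for g in matvec(f_h, x)]
            h = [g * hi for g, hi in zip(gate, h)]
    return x, h


def transport(x, h, w, u, b):
    """Transport module as a simple recurrent update: tanh(Wx + Uh + b)."""
    wx = matvec(w, x)
    uh = matvec(u, h)
    return [math.tanh(a + c + d) for a, c, d in zip(wx, uh, b)]


def init_params(input_size, hidden_size, seed=0):
    """Small random weights; shapes follow the gating scheme above."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {
        "f_x": mat(input_size, hidden_size),   # gate on x, computed from h
        "f_h": mat(hidden_size, input_size),   # gate on h, computed from x
        "w": mat(hidden_size, input_size),
        "u": mat(hidden_size, hidden_size),
        "b": [0.0] * hidden_size,
    }


def fusion_rnn(seq, params, hidden_size, rounds=2):
    """Run Fusion then Transport at every time step; return the last hidden state."""
    h = [0.0] * hidden_size
    for x in seq:
        x_f, h_f = fusion(x, h, params["f_x"], params["f_h"], rounds)
        h = transport(x_f, h_f, params["w"], params["u"], params["b"])
    return h


if __name__ == "__main__":
    params = init_params(input_size=3, hidden_size=4)
    sequence = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
    print(fusion_rnn(sequence, params, hidden_size=4))
```

Under these assumptions, each step needs only two gating matrices and one simple recurrent update, which illustrates the sense in which such a cell is simpler than LSTM or GRU.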
