Empirical Activation Function Effects on Unsupervised Convolutional LSTM Learning

This paper empirically evaluates and analyzes how the choice of recurrent activation and unit activation functions affects the unsupervised convolutional LSTM learning process. The goal of this work is to provide guidance for selecting an appropriate non-linear activation function for convolutional LSTM models that target the video prediction problem. We present an empirical analysis of the non-linear activation functions commonly implemented in deep learning APIs, and we use the Moving MNIST dataset, the most common benchmark for video prediction problems.
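To make the two activation choices concrete, the sketch below shows how they map onto a convolutional LSTM layer in a common deep learning API (Keras). This is a minimal illustrative setup, not the authors' exact architecture: the 64x64 frame size, filter count, network depth, loss, and the list of candidate activations are assumptions for the example.

```python
# Minimal sketch (assumed setup, not the paper's exact model): a ConvLSTM
# predictor for Moving MNIST frames where the "unit activation" and the
# "recurrent (gate) activation" are the quantities being varied.
import tensorflow as tf

def build_convlstm(unit_activation="tanh", recurrent_activation="sigmoid"):
    """Build a small ConvLSTM frame predictor with configurable activations."""
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(None, 64, 64, 1)),    # (time, H, W, channels)
        tf.keras.layers.ConvLSTM2D(
            filters=64, kernel_size=(3, 3), padding="same",
            activation=unit_activation,                     # unit (cell/output) activation
            recurrent_activation=recurrent_activation,      # gate activation
            return_sequences=True),
        tf.keras.layers.Conv3D(
            filters=1, kernel_size=(3, 3, 3),
            padding="same", activation="sigmoid"),          # per-pixel frame output
    ])

# Illustrative candidate activations exposed by typical deep learning APIs.
for act in ("tanh", "relu", "elu", "selu", "softsign"):
    model = build_convlstm(unit_activation=act)
    model.compile(optimizer="adam", loss="binary_crossentropy")
```

In this kind of comparison, each configuration is trained in the same unsupervised (frame prediction) setting so that differences in reconstruction quality can be attributed to the activation choice rather than to the architecture.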
