Modeling Deep Temporal Dependencies with Recurrent "Grammar Cells"

We propose modeling time series by representing the transformations that take a frame at time t to a frame at time t+1. To this end, we show how a bi-linear model of transformations, such as a gated autoencoder, can be turned into a recurrent network by training it to predict future frames from the current frame and the inferred transformation, using backprop-through-time. We also show how stacking multiple layers of gating units in a recurrent pyramid makes it possible to represent the "syntax" of complicated time series, and that this outperforms standard recurrent neural networks in terms of prediction accuracy on a variety of tasks.
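
To make the idea concrete, here is a minimal sketch (in PyTorch) of a single factored gated-autoencoder layer used as a one-step predictor and unrolled for backprop-through-time. The class name, layer sizes, and parameter names (U, V, W, W_out) are illustrative assumptions, not the paper's implementation; the stacked "pyramid" of higher-order layers is omitted.

```python
import torch
import torch.nn as nn

class GatedPredictor(nn.Module):
    """A rough sketch of one factored gated-autoencoder layer used as a
    one-step frame predictor (a single 'grammar cell' layer; sizes and
    training details are illustrative assumptions)."""

    def __init__(self, n_in, n_factors, n_maps):
        super().__init__()
        self.U = nn.Linear(n_in, n_factors, bias=False)   # filters for the earlier frame
        self.V = nn.Linear(n_in, n_factors, bias=False)   # filters for the later frame
        self.W = nn.Linear(n_factors, n_maps)              # factors -> mapping units
        self.W_out = nn.Linear(n_factors, n_in)            # factors -> predicted frame

    def infer_mapping(self, x_prev, x_cur):
        # Bi-linear (multiplicative) interaction: mapping units encode the
        # transformation that takes x_prev to x_cur.
        return torch.sigmoid(self.W(self.U(x_prev) * self.V(x_cur)))

    def predict_next(self, mapping, x_cur):
        # Re-apply the inferred transformation to the current frame to
        # predict the next one.
        factors = (mapping @ self.W.weight) * self.V(x_cur)
        return self.W_out(factors)


def sequence_loss(model, frames):
    # Unrolled one-step prediction over a sequence (frames: T x batch x n_in),
    # trained with backprop-through-time on the frame-prediction error.
    loss = 0.0
    for t in range(1, frames.shape[0] - 1):
        m = model.infer_mapping(frames[t - 1], frames[t])
        pred = model.predict_next(m, frames[t])
        loss = loss + ((pred - frames[t + 1]) ** 2).mean()
    return loss
```

In the full model, the sequence of inferred mappings would itself be fed to a higher gated layer, so that transformations of transformations capture the longer-range "syntax" of the series.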
