Temporal Autoencoding Restricted Boltzmann Machine

Much work has been done refining and characterizing the receptive fields learned by deep learning algorithms, with a particular focus on the Gabor-like filters that emerge when sparsity constraints are enforced during training on natural image datasets. Little work, however, has investigated how these filters extend to the temporal domain, namely through training on natural movies. Here we investigate exactly this problem, both in established temporal deep learning algorithms and in a new learning paradigm proposed here, the Temporal Autoencoding Restricted Boltzmann Machine (TARBM).
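
To make the name concrete, below is a minimal sketch of the autoencoding idea the TARBM refers to: the temporal weights of an RBM are pretrained, autoencoder-style, to reconstruct the current frame of a movie from the preceding frames. This is an illustration under our own assumptions, not the paper's implementation; all names (predict_frame, autoencoding_step, W_t, and so on) are hypothetical, and the static RBM weights, which would be trained separately by contrastive divergence, are omitted.

```python
# Hypothetical sketch: autoencoder-style pretraining of temporal weights
# that predict the current movie frame from the `delay` preceding frames.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

n_vis, delay = 64, 3                                      # pixels per frame, temporal window
W_t = 0.01 * rng.standard_normal((delay, n_vis, n_vis))   # one weight matrix per past frame
b_v = np.zeros(n_vis)                                     # visible bias

def predict_frame(past):
    # past: (delay, n_vis); linear temporal prediction through a sigmoid
    return sigmoid(sum(past[d] @ W_t[d] for d in range(delay)) + b_v)

def autoencoding_step(past, current, lr=0.1):
    # One gradient step on the squared reconstruction error of the current frame.
    global b_v
    pred = predict_frame(past)
    err = (current - pred) * pred * (1 - pred)            # backprop through the sigmoid
    for d in range(delay):
        W_t[d] += lr * np.outer(past[d], err)
    b_v += lr * err
    return np.mean((current - pred) ** 2)

# Toy usage: a random "movie" of 100 frames with values in [0, 1]
movie = rng.random((100, n_vis))
for t in range(delay, movie.shape[0]):
    mse = autoencoding_step(movie[t - delay:t], movie[t])
```

Squared-error reconstruction through a sigmoid is chosen here purely for brevity; the point is only that the temporal weights are fit to predict each frame from its recent past, so that the learned filters capture structure across frames rather than within single frames alone.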
