Structured Recurrent Temporal Restricted Boltzmann Machines

The recurrent temporal restricted Boltzmann machine (RTRBM) is a probabilistic time-series model. The topology of the RTRBM graphical model, however, assumes full connectivity between all pairs of visible and hidden units, thereby ignoring the dependency structure within the observations. Learning this structure has the potential not only to improve prediction performance but also to reveal important dependency patterns in the data. For example, given a meteorological dataset, it could identify regional weather patterns. In this work, we propose a new class of RTRBM, which we refer to as the structured RTRBM (SRTRBM), that explicitly uses a graph to model the dependency structure. Our technique is related to methods such as the graphical lasso, which are used to learn the topology of Gaussian graphical models. We also develop a spike-and-slab version of the RTRBM and combine it with the SRTRBM to learn dependency structures in datasets with real-valued observations. Our experimental results on synthetic and real datasets demonstrate that the SRTRBM can significantly improve the prediction performance of the RTRBM, particularly when the number of visible units is large and the training set is small. It also reveals the dependency structures underlying our benchmark datasets.
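To make the architectural idea concrete, the sketch below shows how a dependency graph can constrain the visible-to-hidden weights inside an RTRBM-style recurrence. This is only a minimal illustration, not the paper's implementation: it assumes a fixed binary mask, whereas the SRTRBM learns the graph during training, and all names (`mask`, `hidden_means`, the dimensions) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_visible, n_hidden, seq_len = 8, 4, 10

# Hypothetical dependency graph: a binary mask over visible-to-hidden edges.
# In the SRTRBM this structure is learned; here it is fixed for illustration.
mask = rng.random((n_hidden, n_visible)) < 0.5

W = rng.normal(scale=0.1, size=(n_hidden, n_visible))   # visible -> hidden weights
Wp = rng.normal(scale=0.1, size=(n_hidden, n_hidden))   # recurrent hidden -> hidden weights
b_h = np.zeros(n_hidden)                                 # hidden biases

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def hidden_means(v_seq):
    """Deterministic RTRBM-style recurrence with graph-masked weights."""
    r = np.zeros(n_hidden)        # initial recurrent hidden state r_0
    W_masked = W * mask           # zero out edges absent from the graph
    rs = []
    for v_t in v_seq:
        r = sigmoid(W_masked @ v_t + Wp @ r + b_h)
        rs.append(r)
    return np.array(rs)

v_seq = rng.random((seq_len, n_visible))
print(hidden_means(v_seq).shape)  # (10, 4)
```

Masking the weight matrix in this way plays a role loosely analogous to the sparsity pattern of the precision matrix in the graphical lasso: zeroed entries encode conditional-independence structure between observations and latent units.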
