Deep Boltzmann Machines as Feed-Forward Hierarchies

The deep Boltzmann machine is a powerful model that extracts the hierarchical structure of observed data. While inference is typically slow due to its undirected nature, we argue that the emerging feature hierarchy is still explicit enough to be traversed in a feedforward fashion. The claim is corroborated by training a set of deep neural networks on real data and measuring the evolution of the representation layer after layer. The analysis reveals that the deep Boltzmann machine produces a feed-forward hierarchy of increasingly invariant representations that clearly surpasses the layer-wise approach.

[1]  Radford M. Neal Probabilistic Inference Using Markov Chain Monte Carlo Methods , 2011 .

[2]  Yee Whye Teh,et al.  A Fast Learning Algorithm for Deep Belief Nets , 2006, Neural Computation.

[3]  Geoffrey E. Hinton,et al.  A Learning Algorithm for Boltzmann Machines , 1985, Cogn. Sci..

[4]  Thomas Hofmann,et al.  Greedy Layer-Wise Training of Deep Networks , 2007 .

[5]  Geoffrey E. Hinton,et al.  Deep Boltzmann Machines , 2009, AISTATS.

[6]  David W. Aha,et al.  Unsupervised and transfer learning challenge , 2011, The 2011 International Joint Conference on Neural Networks.

[7]  J J Hopfield,et al.  Neural networks and physical systems with emergent collective computational abilities. , 1982, Proceedings of the National Academy of Sciences of the United States of America.

[8]  Geoffrey E. Hinton,et al.  Reducing the Dimensionality of Data with Neural Networks , 2006, Science.

[9]  Peter Glöckner,et al.  Why Does Unsupervised Pre-training Help Deep Learning? , 2013 .

[10]  Bernhard Schölkopf,et al.  Nonlinear Component Analysis as a Kernel Eigenvalue Problem , 1998, Neural Computation.

[11]  Klaus-Robert Müller,et al.  Kernel Analysis of Deep Networks , 2011, J. Mach. Learn. Res..

[12]  John J. Hopfield,et al.  Neural networks and physical systems with emergent collective computational abilities , 1999 .

[13]  Geoffrey E. Hinton,et al.  Learning and relearning in Boltzmann machines , 1986 .

[14]  Laurens van der Maaten,et al.  A New Benchmark Dataset for Handwritten Character Recognition , 2009 .

[15]  Joachim M. Buhmann,et al.  On Relevant Dimensions in Kernel Feature Spaces , 2008, J. Mach. Learn. Res..

[16]  Tijmen Tieleman,et al.  Training restricted Boltzmann machines using approximations to the likelihood gradient , 2008, ICML '08.