An Infinite Deep Boltzmann Machine

The deep Boltzmann machine (DBM) is a powerful "deep" probabilistic model that learns a hierarchical representation of the data. However, choosing the size of each hidden layer of a DBM is difficult, as the appropriate model size varies from task to task. Choosing a proper model size is an essential model-selection problem for latent variable graphical models. This paper proposes a new variant of the DBM, called the infinite deep Boltzmann machine (iDBM), which can freely change the number of hidden units participating in the energy function of each layer. A greedy training method is proposed to pre-train the model, after which the size of each layer is fixed and the model is converted into an ordinary DBM. Experimental results on MNIST and CalTech101 Silhouettes indicate that the iDBM learns generative and discriminative models as good as those of the original DBM, while eliminating the need to select hidden layer sizes for DBMs.
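The key mechanism the abstract describes is a layer whose energy only sums over the first z hidden units, so z can grow or shrink during training. A minimal sketch of this idea for a single Boltzmann-machine layer is shown below; the sizes, parameter names, and the simple truncation-at-z parameterization are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 6 visible units, capacity for up to 8 hidden units.
n_visible, max_hidden = 6, 8
W = rng.normal(scale=0.1, size=(n_visible, max_hidden))  # visible-hidden weights
b = np.zeros(n_visible)   # visible biases
c = np.zeros(max_hidden)  # hidden biases

def energy(v, h, z):
    """Energy of one layer using only the first z hidden units.

    Truncating the sums at z lets the effective layer width change
    during training, in the spirit of the adaptive energy function the
    iDBM abstract describes (the actual parameterization may differ).
    """
    Wz, cz, hz = W[:, :z], c[:z], h[:z]
    return -(v @ b + hz @ cz + v @ Wz @ hz)

v = rng.integers(0, 2, size=n_visible).astype(float)
h = rng.integers(0, 2, size=max_hidden).astype(float)

# Growing z adds terms to the energy; units beyond index z do not participate.
for z in (2, 4, 8):
    print(z, energy(v, h, z))
```

Once z is fixed after pre-training, the truncated parameters `W[:, :z]` and `c[:z]` simply define an ordinary fixed-width layer, matching the abstract's conversion of the iDBM into a standard DBM.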
