Large-scale restricted Boltzmann machines on a single GPU

Recent work on deep belief networks (DBNs) has shown that large-scale unsupervised feature learning models can dramatically improve application performance in many fields. Training the billions of parameters in models such as restricted Boltzmann machines (RBMs) is computationally challenging for modern CPUs. Graphics Processing Units (GPUs) have been employed in many large-scale deep learning models to boost performance thanks to their massively parallel computing capability. Unfortunately, the limited device memory of a GPU restricts the size of the model that can be trained on a single card. Multi-GPU approaches, on the other hand, suffer from inter-GPU communication overhead and higher hardware cost. In this paper, we propose a novel memory-efficient algorithm that trains large-scale RBMs on a single GPU without a model-size restriction while preserving the performance gain of GPU parallel computation. In particular, our experiments demonstrate that the approach uses 75% less memory at the cost of only a 10% performance loss when training large-scale RBMs with billions of parameters.
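Since the abstract only sketches the idea, the following is a minimal NumPy sketch of one way such a memory-efficient scheme can work: the RBM weight matrix is split column-wise (along the hidden dimension) into chunks, so that only one chunk needs to be resident in device memory during each phase of a CD-1 update. The function and variable names, and the chunking scheme itself, are illustrative assumptions rather than the paper's exact algorithm.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step_chunked(v0, W_chunks, b, c_chunks, lr=0.01, rng=None):
    """One CD-1 update with W split column-wise into chunks.

    Pure-NumPy sketch: on a GPU, each chunk would be streamed
    host->device, used, updated, and written back, so only one
    chunk of W must be resident in device memory at a time.
    """
    rng = rng or np.random.default_rng(0)
    # Positive phase: hidden probabilities and samples, chunk by chunk.
    # Hidden activations (batch x n_hid) are tiny relative to W.
    h0_probs = [sigmoid(v0 @ W + c) for W, c in zip(W_chunks, c_chunks)]
    h0_samples = [(rng.random(p.shape) < p).astype(v0.dtype) for p in h0_probs]
    # Reconstruction: accumulate visible pre-activations across chunks.
    pre_v = b + sum(h @ W.T for h, W in zip(h0_samples, W_chunks))
    v1 = sigmoid(pre_v)
    # Negative phase: revisit each chunk once, updating its weights
    # as soon as its gradient is available.
    for W, c, h0 in zip(W_chunks, c_chunks, h0_probs):
        h1 = sigmoid(v1 @ W + c)
        W += lr * (v0.T @ h0 - v1.T @ h1) / v0.shape[0]
        c += lr * (h0 - h1).mean(axis=0)
    b += lr * (v0 - v1).mean(axis=0)
    return v1

# Illustrative usage on random data:
rng = np.random.default_rng(1)
n_vis, n_hid, n_chunks, batch = 784, 4096, 8, 64
W_chunks = [0.01 * rng.standard_normal((n_vis, n_hid // n_chunks))
            for _ in range(n_chunks)]
c_chunks = [np.zeros(n_hid // n_chunks) for _ in range(n_chunks)]
b = np.zeros(n_vis)
v0 = (rng.random((batch, n_vis)) < 0.5).astype(np.float64)
v1 = cd1_step_chunked(v0, W_chunks, b, c_chunks)

With this layout, each weight chunk passes through memory twice per CD-1 step (once in the positive phase and once in the negative phase plus update); that kind of extra data movement is a plausible source of the modest slowdown traded for a large reduction in resident memory.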
