Class sparsity signature based Restricted Boltzmann Machine

Abstract: Restricted Boltzmann Machines (RBMs) are widely used in machine learning as building blocks for deep learning architectures such as Deep Boltzmann Machines (DBMs) and Deep Belief Networks (DBNs). However, they are prone to overfitting, and several regularization techniques have been proposed to mitigate this effect. In this paper, we propose a semi-supervised class sparsity signature based RBM formulation that combines unsupervised generative training of the RBM with a supervised sparsity regularizer. The proposed approach, termed cssRBM, enforces sparsity at the class level so that coherent and discriminative representations are learned during training. The unsupervised component allows the model to use external training data to learn better generative features, while the supervised component fine-tunes the learned features for discrimination. We construct both DBMs and DBNs with cssRBM units and evaluate their performance on multiple publicly available benchmark datasets. Experiments on the MNIST and CIFAR-10 databases demonstrate that the proposed approaches are comparable with state-of-the-art deep learning architectures in the literature. We also evaluate performance on one of the most challenging face databases, the Point and Shoot Challenge (PaSC) dataset. The results show that the proposed approaches improve state-of-the-art results by 15% on the PaSC database.
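The idea of adding a class-level sparsity term to a generative RBM objective can be illustrated with a minimal NumPy sketch. The abstract does not give the exact form of the regularizer, so the group-wise l2,1-style penalty below is an assumption, and the names `hidden_probs`, `class_sparsity_penalty`, `regularized_loss`, and the weight `lam` are hypothetical and chosen only for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def hidden_probs(V, W, c):
    """Standard RBM hidden activation probabilities p(h=1 | v).

    V: (n_samples, n_visible), W: (n_visible, n_hidden), c: (n_hidden,)
    """
    return sigmoid(V @ W + c)

def class_sparsity_penalty(H, labels):
    """Assumed l2,1-style penalty computed separately for each class.

    For every class, take the l2 norm of each hidden unit's activations
    over that class's samples, then sum those norms. Minimizing this sum
    pushes whole hidden units toward zero within a class, so samples of
    one class tend to share the same small set of active units.
    """
    total = 0.0
    for k in np.unique(labels):
        Hk = H[labels == k]                       # (n_k, n_hidden)
        total += np.sqrt((Hk ** 2).sum(axis=0)).sum()
    return total

def regularized_loss(generative_loss, H, labels, lam=0.1):
    """Hypothetical combined objective: generative term plus weighted penalty."""
    return generative_loss + lam * class_sparsity_penalty(H, labels)
```

In this sketch the generative term (for example, a contrastive divergence or reconstruction loss) is computed elsewhere and passed in as a number; only the supervised penalty depends on the class labels.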
