Classification using discriminative restricted Boltzmann machines

Recently, many applications of Restricted Boltzmann Machines (RBMs) have been developed for a wide variety of learning problems. However, RBMs are usually used as feature extractors for another learning algorithm, or to provide a good initialization for deep feed-forward neural network classifiers, and are not considered a standalone solution to classification problems. In this paper, we argue that RBMs provide a self-contained framework for deriving competitive non-linear classifiers. We present an evaluation of different learning algorithms for RBMs that aim to introduce a discriminative component into RBM training and to improve their performance as classifiers. This approach is simple in that RBMs are used directly to build a classifier, rather than as a stepping stone. Finally, we demonstrate how discriminative RBMs can also be successfully employed in a semi-supervised setting.
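To make "used directly to build a classifier" concrete, here is a minimal sketch of how an RBM with label units can yield class probabilities without sampling. It assumes the common formulation in which a one-hot label is attached to the input layer and the binary hidden units are summed out analytically. The names `W`, `U`, `c`, `d` and the function `class_posterior` are illustrative assumptions and do not come from the abstract.

```python
import math

def softplus(a):
    # Numerically stable log(1 + exp(a)).
    return max(a, 0.0) + math.log1p(math.exp(-abs(a)))

def class_posterior(x, W, U, c, d):
    """Return p(y | x) for an RBM with inputs x, a one-hot label y,
    and binary hidden units (hypothetical parameter layout):
      W[j][i]  hidden-to-input weights
      U[j][y]  hidden-to-label weights
      c[j]     hidden biases
      d[y]     label biases
    """
    n_hidden, n_classes = len(c), len(d)
    # The input's contribution to each hidden unit does not depend on y.
    pre = [c[j] + sum(W[j][i] * x[i] for i in range(len(x)))
           for j in range(n_hidden)]
    # Unnormalized log-probability of each class after summing out
    # the hidden units (terms independent of y are dropped).
    scores = [d[y] + sum(softplus(pre[j] + U[j][y]) for j in range(n_hidden))
              for y in range(n_classes)]
    # Softmax over classes, shifted by the max for stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

# Toy parameters: 2 inputs, 2 hidden units, 2 classes.
probs = class_posterior(
    x=[1.0, 0.0],
    W=[[0.5, -0.5], [0.1, 0.2]],
    U=[[2.0, 0.0], [0.0, 0.0]],
    c=[0.0, 0.0],
    d=[0.0, 0.0],
)
print(probs)
```

Because the normalization runs only over the classes, the cost is linear in the number of hidden units times the number of classes, which is what makes a discriminative training objective on this quantity tractable under these assumptions.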
