Paper information - Classification using discriminative restricted Boltzmann machines

Classification using discriminative restricted Boltzmann machines

Recently, many applications for Restricted Boltzmann Machines (RBMs) have been developed for a large variety of learning problems. However, RBMs are usually used as feature extractors for another learning algorithm or to provide a good initialization for deep feed-forward neural network classifiers, and are not considered as a standalone solution to classification problems. In this paper, we argue that RBMs provide a self-contained framework for deriving competitive non-linear classifiers. We present an evaluation of different learning algorithms for RBMs which aim at introducing a discriminative component to RBM training and improve their performance as classifiers. This approach is simple in that RBMs are used directly to build a classifier, rather than as a stepping stone. Finally, we demonstrate how discriminative RBMs can also be successfully employed in a semi-supervised setting.

Yoshua Bengio | Hugo Larochelle
