Paper Information - Self Supervised Boosting

Self Supervised Boosting

Boosting algorithms and successful applications thereof abound for classification and regression learning problems, but not for unsupervised learning. We propose a sequential approach to adding features to a random field model by training them to improve classification performance between the data and an equal-sized sample of "negative examples" generated from the model's current estimate of the data density. Training in each boosting round proceeds in three stages: first we sample negative examples from the model's current Boltzmann distribution. Next, a feature is trained to improve classification performance between data and negative examples. Finally, a coefficient is learned which determines the importance of this feature relative to ones already in the pool. Negative examples only need to be generated once to learn each new feature. The validity of the approach is demonstrated on binary digits and continuous synthetic data.
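The three stages of a boosting round described in the abstract can be sketched roughly as follows. This is a minimal illustration under several assumptions that are not taken from the paper: the data are short binary vectors, so the Boltzmann distribution is sampled exactly by enumerating all states instead of running MCMC; each feature is a logistic-regression logit over a hypothetical feature map `phi` (raw bits plus pairwise products); and the coefficient is chosen by a grid search on the exact data log-likelihood. All function names and parameters are hypothetical.

```python
import itertools
import numpy as np

def phi(X):
    """Hypothetical feature map: raw bits plus all pairwise products."""
    i, j = np.triu_indices(X.shape[1], k=1)
    return np.hstack([X, X[:, i] * X[:, j]])

def fit_feature(pos, neg, steps=200, lr=0.5):
    """Logistic regression separating data (label 1) from negatives (label 0)."""
    X = np.vstack([pos, neg]).astype(float)
    y = np.concatenate([np.ones(len(pos)), np.zeros(len(neg))])
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w += lr * X.T @ (y - p) / len(y)   # gradient ascent on log-likelihood
        b += lr * np.mean(y - p)
    return w, b

def log_probs(scores):
    """Normalise unnormalised log-probabilities over the enumerated states."""
    m = scores.max()
    return scores - (m + np.log(np.exp(scores - m).sum()))

def self_supervised_boost(data, rounds=5, seed=0):
    rng = np.random.default_rng(seed)
    d = data.shape[1]
    # Assumption: d is small enough to enumerate all 2**d binary states.
    states = np.array(list(itertools.product([0, 1], repeat=d)), dtype=float)
    # Row index of each data vector in the enumerated state table.
    data_idx = (data @ (2 ** np.arange(d)[::-1])).astype(int)
    scores = np.zeros(len(states))          # start from the uniform model
    history = [log_probs(scores)[data_idx].mean()]
    for _ in range(rounds):
        # Stage 1: draw an equal-sized sample of negatives from the current model.
        p = np.exp(log_probs(scores))
        neg = states[rng.choice(len(states), size=len(data), p=p)]
        # Stage 2: train a feature to classify data against negatives.
        w, b = fit_feature(phi(data), phi(neg))
        f = phi(states) @ w + b
        # Stage 3: choose the feature's coefficient (grid search, an assumption).
        alphas = np.linspace(0.0, 2.0, 21)
        lls = [log_probs(scores + a * f)[data_idx].mean() for a in alphas]
        best = alphas[int(np.argmax(lls))]
        scores = scores + best * f
        history.append(max(lls))
    return scores, history
```

Calling `self_supervised_boost(data, rounds=3)` on an array of 0/1 rows returns the unnormalised log-probability of every enumerated state and the mean data log-likelihood after each round. Because the grid includes a coefficient of zero, the log-likelihood cannot decrease from one round to the next in this sketch. Note that the negatives are drawn once per round, matching the abstract's remark that they need to be generated only once per new feature.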

Geoffrey E. Hinton | Max Welling | Richard S. Zemel

[1] David Haussler, et al. Unsupervised learning of distributions on binary vectors using two layer networks, 1991, NIPS 1991.

[2] John D. Lafferty, et al. Inducing Features of Random Fields, 1995, IEEE Trans. Pattern Anal. Mach. Intell.

[3] Song-Chun Zhu, et al. Minimax Entropy Principle and Its Application to Texture Modeling, 1997, Neural Computation.

[4] Yoram Singer, et al. Improved Boosting Algorithms Using Confidence-rated Predictions, 1998, COLT '98.

[5] Peter L. Bartlett, et al. Boosting Algorithms as Gradient Descent, 1999, NIPS.

[6] Geoffrey E. Hinton, et al. Spiking Boltzmann Machines, 1999, NIPS.

[7] Y. Freund, et al. Discussion of the Paper "Additive Logistic Regression: a Statistical View of Boosting" By , 2000.

[8] J. Friedman. Greedy function approximation: A gradient boosting machine, 2001.

[9] John D. Lafferty, et al. Boosting and Maximum Likelihood for Exponential Models, 2001, NIPS.

[10] Geoffrey E. Hinton. Training Products of Experts by Minimizing Contrastive Divergence, 2002, Neural Computation.

[11] Saharon Rosset, et al. Boosting Density Estimation, 2002, NIPS.