ManifoldBoost: stagewise function approximation for fully-, semi- and un-supervised learning

We describe a manifold learning framework that naturally accommodates supervised learning, partially supervised learning, and unsupervised clustering as particular cases. Our method chooses a function by minimizing a loss subject to a manifold regularization penalty. This augmented cost is minimized using a greedy, stagewise, functional minimization procedure, as in gradient boosting. Each boosting stage is fast and efficient. We demonstrate our approach using both radial basis function approximations and trees. Our method performs at the state of the art on many standard semi-supervised learning benchmarks, and we report results on large-scale datasets.
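To make the objective and its stagewise minimization concrete, the following is a minimal sketch, not the authors' implementation. It assumes a squared loss on the labeled points, a k-nearest-neighbor graph Laplacian penalty gamma * f^T L f over labeled and unlabeled points together, and shallow regression trees as the base learners; the function name `manifold_boost` and its parameters are hypothetical.

```python
# Minimal sketch of manifold-regularized gradient boosting (assumptions:
# squared loss, k-NN graph Laplacian penalty, tree base learners).
import numpy as np
from sklearn.neighbors import kneighbors_graph
from sklearn.tree import DecisionTreeRegressor

def manifold_boost(X_labeled, y, X_unlabeled, stages=100, nu=0.1,
                   gamma=1.0, k=10):
    X = np.vstack([X_labeled, X_unlabeled])
    y = np.asarray(y, dtype=float)
    n_l = len(y)
    # Graph Laplacian L = D - W on a symmetrized k-NN graph over all points.
    W = kneighbors_graph(X, k, mode='connectivity', include_self=False)
    W = 0.5 * (W + W.T)
    L = np.diag(np.asarray(W.sum(axis=1)).ravel()) - W.toarray()
    f = np.zeros(len(X))  # current ensemble output on all points
    learners = []
    for _ in range(stages):
        # Functional gradient of sum_i (f_i - y_i)^2 + gamma * f^T L f:
        grad = 2.0 * gamma * L.dot(f)
        grad[:n_l] += 2.0 * (f[:n_l] - y)
        # Fit the weak learner to the negative gradient, then take a
        # shrunken stagewise step, as in gradient boosting.
        tree = DecisionTreeRegressor(max_depth=3)
        tree.fit(X, -grad)
        f += nu * tree.predict(X)
        learners.append(tree)
    return lambda Xq: nu * sum(t.predict(Xq) for t in learners)
```

Each stage fits the base learner to the negative functional gradient of the penalized loss at the current ensemble, so the unlabeled points influence the fit only through the Laplacian term.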
