Gradient Regularized Budgeted Boosting

As machine learning moves increasingly toward real-world applications, controlling the test-time cost of algorithms becomes ever more crucial. Recent work, such as the Greedy Miser and SpeedBoost, incorporates test-time budget constraints into the training procedure and learns classifiers that provably stay within budget (in expectation). So far, however, these algorithms are limited to the supervised learning scenario, where sufficient amounts of labeled data are available. In this paper we investigate the common scenario where labeled data is scarce but unlabeled data is available in abundance. We propose an algorithm that leverages the unlabeled data (through Laplace smoothing) and learns classifiers under budget constraints. Our model, based on gradient boosted regression trees (GBRT), is, to our knowledge, the first algorithm for semi-supervised budgeted learning.
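To make the combination of ideas concrete, the following is a minimal sketch and not the paper's actual algorithm. It assumes that the smoothing term is a graph-Laplacian smoothness penalty over labeled and unlabeled points, that the labeled loss is squared error, and that the budget enters as a one-time penalty for the first use of each feature. All names (`boost`, `fit_stump`, `mu`, `lam`, the toy data) are hypothetical, and regression stumps stand in for full regression trees.

```python
# Illustrative sketch only: loss, graph, cost penalty, and stump learner are assumptions.

def laplacian_gradient(F, edges):
    """Gradient of 0.5 * sum over edges (i, j, w) of w * (F[i] - F[j])**2."""
    g = [0.0] * len(F)
    for i, j, w in edges:
        d = w * (F[i] - F[j])
        g[i] += d
        g[j] -= d
    return g

def objective(F, y, edges, mu):
    """Squared loss on labeled points (y[i] is None if unlabeled) plus smoothness."""
    loss = sum(0.5 * (F[i] - yi) ** 2 for i, yi in enumerate(y) if yi is not None)
    smooth = sum(0.5 * w * (F[i] - F[j]) ** 2 for i, j, w in edges)
    return loss + mu * smooth

def objective_gradient(F, y, edges, mu):
    g = [mu * gi for gi in laplacian_gradient(F, edges)]
    for i, yi in enumerate(y):
        if yi is not None:
            g[i] += F[i] - yi
    return g

def fit_stump(X, target, used, costs, lam):
    """Stump minimizing squared error to target plus lam * cost of a not-yet-used feature."""
    best = None
    n = len(X)
    for f in range(len(X[0])):
        penalty = 0.0 if f in used else lam * costs[f]
        for t in sorted({row[f] for row in X}):
            left = [target[i] for i in range(n) if X[i][f] <= t]
            right = [target[i] for i in range(n) if X[i][f] > t]
            if not left or not right:
                continue
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            err = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
            if best is None or err + penalty < best[0]:
                best = (err + penalty, f, t, ml, mr)
    return best[1:]

def boost(X, y, edges, costs, mu=1.0, lam=0.1, rounds=50, lr=0.5):
    """Functional gradient descent: each round fits a stump to the negative gradient."""
    F = [0.0] * len(X)
    used, stumps = set(), []
    for _ in range(rounds):
        g = objective_gradient(F, y, edges, mu)
        f, t, ml, mr = fit_stump(X, [-gi for gi in g], used, costs, lam)
        used.add(f)  # a feature is paid for once, then free to reuse
        stumps.append((f, t, lr * ml, lr * mr))
        F = [F[i] + lr * (ml if X[i][f] <= t else mr) for i in range(len(X))]
    return F, used, stumps

# Toy data: two clusters, one label each; feature 1 is ten times as expensive as feature 0.
X = [[0.0, 0.1], [0.1, 0.0], [0.2, 0.2], [1.0, 0.9], [1.1, 1.0], [1.2, 1.1]]
y = [-1.0, None, None, None, None, 1.0]
edges = [(0, 1, 1.0), (1, 2, 1.0), (3, 4, 1.0), (4, 5, 1.0)]  # neighbor graph
costs = [1.0, 10.0]

F, used, stumps = boost(X, y, edges, costs)
print("predictions:", [round(v, 3) for v in F])
print("features purchased:", sorted(used))
```

In this sketch the unlabeled points influence training only through the graph edges, which pull their predictions toward those of their labeled neighbors, while `lam` trades accuracy against the cost of acquiring new features.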
