Paper information - FeatureBoost: A Meta-Learning Algorithm that Improves Model Robustness

Most machine learning algorithms are lazy: they extract from the training set the minimum information needed to predict its labels. Unfortunately, this often leads to models that are not robust when features are removed or obscured in future test data. For example, a backprop net trained to steer a car typically learns to recognize the edges of the road, but does not learn to recognize other features such as the stripes painted on the road which could be useful when road edges disappear in tunnels or are obscured by passing trucks. The net learns the minimum necessary to steer on the training set. In contrast, human driving is remarkably robust as features become obscured. Motivated by this, we propose a framework for robust learning that biases induction to learn many different models from the same inputs. We present a meta algorithm for robust learning called FeatureBoost, and demonstrate it on several problems using backprop nets, k-nearest neighbor, and decision trees.
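The abstract does not spell out the FeatureBoost procedure, so the following is not the authors' algorithm. It is a minimal sketch, on assumed toy data, of the failure mode described above (a learner that keeps only the minimum information breaks when that one feature is obscured) and of one simple way to learn several models from the same inputs: voting over one-feature threshold models. All function names, the feature interpretation, and the data are hypothetical.

```python
# Illustrative sketch only; this is NOT the FeatureBoost algorithm from the paper.
# Every name here (fit_stump, lazy_learner, subset_ensemble, obscure) is hypothetical.

def fit_stump(X, y, j):
    """Fit a one-feature threshold classifier on feature j; labels are -1/+1."""
    pos = [row[j] for row, label in zip(X, y) if label == 1]
    neg = [row[j] for row, label in zip(X, y) if label == -1]
    mean_pos, mean_neg = sum(pos) / len(pos), sum(neg) / len(neg)
    threshold = (mean_pos + mean_neg) / 2.0      # midpoint between class means
    polarity = 1 if mean_pos >= mean_neg else -1
    return (j, threshold, polarity)

def predict_stump(stump, row):
    j, threshold, polarity = stump
    return 1 if polarity * (row[j] - threshold) > 0 else -1

def accuracy(predict, X, y):
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)

def lazy_learner(X, y):
    """Keep only the single best stump: the minimum needed to fit the training set."""
    stumps = [fit_stump(X, y, j) for j in range(len(X[0]))]
    best = max(stumps, key=lambda s: accuracy(lambda r: predict_stump(s, r), X, y))
    return lambda row: predict_stump(best, row)

def subset_ensemble(X, y):
    """Train one stump per feature and combine them by majority vote."""
    stumps = [fit_stump(X, y, j) for j in range(len(X[0]))]
    def predict(row):
        votes = sum(predict_stump(s, row) for s in stumps)
        return 1 if votes > 0 else -1
    return predict

def obscure(X, j, fill=0.0):
    """Simulate a feature disappearing at test time by overwriting column j."""
    return [[fill if k == j else v for k, v in enumerate(row)] for row in X]

# Assumed toy data: three redundant features that all track the label
# (think road edge, lane stripe, centre line), at different scales.
y = [1, -1] * 10
X = [[1.0 * label, 2.0 * label, 0.5 * label] for label in y]

lazy = lazy_learner(X, y)
ensemble = subset_ensemble(X, y)

X_obscured = obscure(X, 0)                 # feature 0 vanishes at test time
lazy_acc = accuracy(lazy, X_obscured, y)
ensemble_acc = accuracy(ensemble, X_obscured, y)
print(lazy_acc, ensemble_acc)
```

On this toy data both learners fit the training set perfectly, but once feature 0 is obscured the single-stump learner falls to chance (0.5) while the voting ensemble stays at 1.0, because two of its three models never relied on the missing feature.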

J. Langford | A. Blum | R. Caruana | Joseph O'Sullivan
