Boosting Classifiers Built from Different Subsets of Features

We focus on adapting boosting to representation spaces composed of different subsets of features. Rather than forcing a single weak learner to handle data that may come from different sources (e.g., images, text, and sound), we propose decomposing the learning task into several dependent boosting sub-problems, each treated by a different weak learner, that collaborate optimally during the weight-update stage. To achieve this, we introduce a new weighting scheme for which we provide theoretical results. Experiments show that our method performs significantly better than any combination of independent boosting procedures.
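
The abstract does not reproduce the paper's actual weighting scheme, but the general idea of coupled boosting sub-problems over feature subsets can be sketched. The following Python sketch is a hypothetical illustration, not the authors' algorithm: the function names (`multiview_boost`, `predict`), the choice of decision stumps as weak learners, and the AdaBoost-style exponential update are all assumptions. The point it illustrates is the coupling mechanism: one weak learner per feature subset ("view") is trained each round on a single shared example-weight distribution, and that distribution is updated jointly, so the sub-problems influence one another instead of boosting independently per view.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def multiview_boost(views, y, n_rounds=50):
    """Hypothetical sketch: boost one weak learner per feature subset
    ("view") while sharing a single example-weight distribution.

    views : list of arrays, each of shape (n_samples, n_features_v)
    y     : labels in {-1, +1}, shape (n_samples,)
    """
    n = len(y)
    w = np.full(n, 1.0 / n)        # shared example weights across all views
    ensemble = []                  # list of (view index, stump, alpha)
    for _ in range(n_rounds):
        # Train one weak learner per view on the shared weights,
        # then keep the one with the lowest weighted error.
        best = None
        for v, X in enumerate(views):
            stump = DecisionTreeClassifier(max_depth=1)
            stump.fit(X, y, sample_weight=w)
            err = np.sum(w * (stump.predict(X) != y))
            if best is None or err < best[0]:
                best = (err, v, stump)
        err, v, stump = best
        err = np.clip(err, 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)
        # Joint weight update: the learner chosen this round reshapes the
        # weights seen by *every* view, which couples the sub-problems.
        w *= np.exp(-alpha * y * stump.predict(views[v]))
        w /= w.sum()
        ensemble.append((v, stump, alpha))
    return ensemble

def predict(ensemble, views):
    """Weighted vote of all weak learners, each applied to its own view."""
    score = sum(a * h.predict(views[v]) for v, h, a in ensemble)
    return np.sign(score)
```

Sharing one weight distribution is what makes the sub-problems dependent; the baseline the abstract compares against would instead run an independent boosting procedure on each view and combine the resulting classifiers afterwards.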
