Combining the advice of experts with randomized boosting for robust pattern recognition

We have developed an algorithm, called ShareBoost, for combining multiple classifiers from multiple information sources. The algorithm offers a number of advantages, such as increased confidence in decision-making resulting from combined complementary data, good performance against noise, and the ability to exploit interplay between sensor subspaces. We have also developed a randomized version of ShareBoost, called rShareBoost, by casting ShareBoost within an adversarial multi-armed bandit framework. This in turn allows us to show that rShareBoost is efficient and convergent. Both algorithms have shown promise in a number of applications. The hallmark of these algorithms is a set of strategies for mining and exploiting the most informative sensor sources for a given situation. These strategies are computations performed by the algorithms. In this paper, we propose to treat strategies as advice given to an algorithm by “experts” or an “oracle.” In the context of pattern recognition, there can be several pattern recognition strategies. Each strategy makes different assumptions regarding the fidelity of each sensor source and uses different data to arrive at its estimates. Each strategy may place different trust in a sensor at different times, and each may perform better in different situations. We introduce a novel algorithm for combining the advice of the experts to achieve robust pattern recognition performance. We show that, with high probability, the algorithm seeks out the advice of the experts from decision-relevant information sources to make optimal predictions. Finally, we provide experimental results using face and infrared image data that corroborate our theoretical analysis.
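To make the idea of combining expert advice concrete, the following is a minimal sketch of a generic exponentially weighted (Hedge-style) forecaster. It is not ShareBoost, rShareBoost, or the algorithm proposed in this paper; the function name `hedge_combine`, the learning rate `eta`, and the toy data are assumptions introduced purely for illustration. Each expert issues a label in {-1, +1} per round, the combined prediction is a weighted vote, and experts that err have their weights reduced multiplicatively.

```python
import math


def hedge_combine(expert_predictions, labels, eta=0.5):
    """Hedge-style weighted vote over expert advice (illustrative sketch only).

    expert_predictions: list of rounds; each round is a list with one
        prediction in {-1, +1} per expert.
    labels: list of true labels in {-1, +1}, one per round.
    eta: hypothetical learning rate controlling how fast weights decay.
    Returns (combined_predictions, normalized_final_weights).
    """
    n_experts = len(expert_predictions[0])
    weights = [1.0] * n_experts  # start with uniform trust in every expert
    combined = []
    for advice, y in zip(expert_predictions, labels):
        total = sum(weights)
        # Weighted vote of the experts' advice for this round.
        score = sum(w * a for w, a in zip(weights, advice)) / total
        combined.append(1 if score >= 0 else -1)
        # Multiplicatively penalize each expert that predicted incorrectly.
        weights = [
            w * math.exp(-eta) if a != y else w
            for w, a in zip(weights, advice)
        ]
    total = sum(weights)
    return combined, [w / total for w in weights]


# Toy data (assumed): expert 0 is always right, expert 1 is always wrong,
# expert 2 always answers +1.
labels = [1, -1, 1, 1, -1, 1]
advice = [[y, -y, 1] for y in labels]
predictions, final_weights = hedge_combine(advice, labels)
```

In this toy run the weight mass shifts toward the expert that is consistently correct, which is the qualitative behavior the abstract describes as seeking out advice from decision-relevant sources. A bandit variant would observe feedback only for the source actually queried, which this full-information sketch does not model.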
