Local Classifier Weighting by Quadratic Programming

It has been widely accepted that the classification accuracy can be improved by combining outputs of multiple classifiers. However, how to combine multiple classifiers with various (potentially conflicting) decisions is still an open problem. A rich collection of classifier combination procedures-many of which are heuristic in nature-have been developed for this goal. In this brief, we describe a dynamic approach to combine classifiers that have expertise in different regions of the input space. To this end, we use local classifier accuracy estimates to weight classifier outputs. Specifically, we estimate local recognition accuracies of classifiers near a query sample by utilizing its nearest neighbors, and then use these estimates to find the best weights of classifiers to label the query. The problem is formulated as a convex quadratic optimization problem, which returns optimal nonnegative classifier weights with respect to the chosen objective function, and the weights ensure that locally most accurate classifiers are weighted more heavily for labeling the query sample. Experimental results on several data sets indicate that the proposed weighting scheme outperforms other popular classifier combination schemes, particularly on problems with complex decision boundaries. Hence, the results indicate that local classification-accuracy-based combination techniques are well suited for decision making when the classifiers are trained by focusing on different regions of the input space.

[1]  Yoav Freund,et al.  A decision-theoretic generalization of on-line learning and an application to boosting , 1997, EuroCOLT.

[2]  Ludmila I. Kuncheva,et al.  Switching between selection and fusion in combining classifiers: an experiment , 2002, IEEE Trans. Syst. Man Cybern. Part B.

[3]  Josef Kittler,et al.  Combining multiple classifiers by averaging or by multiplying? , 2000, Pattern Recognit..

[4]  Yoav Freund,et al.  A decision-theoretic generalization of on-line learning and an application to boosting , 1995, EuroCOLT.

[5]  James C. Bezdek,et al.  Decision templates for multiple classifier fusion: an experimental comparison , 2001, Pattern Recognit..

[6]  Gert Vegter,et al.  In handbook of discrete and computational geometry , 1997 .

[7]  Josef Kittler,et al.  Multiple Classifier Fusion in Probabilistic Neural Networks , 2002, Pattern Analysis & Applications.

[8]  R. Polikar,et al.  Ensemble based systems in decision making , 2006, IEEE Circuits and Systems Magazine.

[9]  Fabio Roli,et al.  A theoretical and experimental analysis of linear combiners for multiple classifier systems , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[10]  Sargur N. Srihari,et al.  Decision Combination in Multiple Classifier Systems , 1994, IEEE Trans. Pattern Anal. Mach. Intell..

[11]  Geoffrey E. Hinton,et al.  Adaptive Mixtures of Local Experts , 1991, Neural Computation.

[12]  Jiri Matas,et al.  On Combining Classifiers , 1998, IEEE Trans. Pattern Anal. Mach. Intell..

[13]  Piotr Indyk,et al.  Nearest Neighbors in High-Dimensional Spaces , 2004, Handbook of Discrete and Computational Geometry, 2nd Ed..

[14]  Leo Breiman,et al.  Stacked regressions , 2004, Machine Learning.

[15]  Chih-Jen Lin,et al.  A comparison of methods for multiclass support vector machines , 2002, IEEE Trans. Neural Networks.

[16]  Jon Louis Bentley,et al.  An Algorithm for Finding Best Matches in Logarithmic Expected Time , 1977, TOMS.

[17]  Robi Polikar,et al.  An ensemble based data fusion approach for early diagnosis of Alzheimer's disease , 2008, Inf. Fusion.

[18]  Ludmila I. Kuncheva,et al.  A Theoretical Study on Six Classifier Fusion Strategies , 2002, IEEE Trans. Pattern Anal. Mach. Intell..

[19]  Galina L. Rogova,et al.  Combining the results of several neural network classifiers , 1994, Neural Networks.

[20]  Kagan Tumer,et al.  Analysis of decision boundaries in linearly combined neural classifiers , 1996, Pattern Recognit..

[21]  Yi Lu,et al.  Knowledge integration in a multiple classifier system , 2004, Applied Intelligence.

[22]  Volker Tresp,et al.  Combining Estimators Using Non-Constant Weighting Functions , 1994, NIPS.

[23]  Bernhard Schölkopf,et al.  Comparing support vector machines with Gaussian kernels to radial basis function classifiers , 1997, IEEE Trans. Signal Process..

[24]  Yoav Freund,et al.  Experiments with a New Boosting Algorithm , 1996, ICML.

[25]  Naonori Ueda,et al.  Optimal Linear Combination of Neural Networks for Improving Classification Performance , 2000, IEEE Trans. Pattern Anal. Mach. Intell..

[26]  Kevin W. Bowyer,et al.  Combination of multiple classifiers using local accuracy estimates , 1996, Proceedings CVPR IEEE Computer Society Conference on Computer Vision and Pattern Recognition.