Combining Classifiers Using Correspondence Analysis

Several effective methods for improving the performance of a single learning algorithm have been developed recently. The general approach is to create a set of learned models by repeatedly applying the algorithm to different versions of the training data, and then to combine the learned models' predictions according to a prescribed voting scheme. In contrast, little work has been done on combining the predictions of models generated by several learning algorithms with different representations and/or search strategies. This paper describes a method that uses stacking and correspondence analysis to model the relationship between the training examples and the way in which they are classified by a collection of learned models. A nearest-neighbor method is then applied within the resulting representation to classify previously unseen examples. The new algorithm consistently performs as well as or better than other combining techniques on a suite of data sets.
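
Below is a minimal Python sketch of the pipeline the abstract describes: base classifiers are trained, their predictions are one-hot encoded into an indicator matrix, correspondence analysis embeds the training examples in a low-dimensional space, and new examples are classified by nearest neighbor in that space. It illustrates the general idea only, not the paper's exact algorithm; the helper names (indicator, ca_fit, ca_project), the choice of base learners, the use of resubstitution predictions, and the nearest-training-example decision rule are all simplifying assumptions made for brevity.

    # Illustrative sketch only: stacking + correspondence analysis + nearest neighbor.
    import numpy as np
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.naive_bayes import GaussianNB
    from sklearn.neighbors import KNeighborsClassifier

    def indicator(preds, n_classes):
        # One-hot encode each base learner's predictions and stack the blocks column-wise.
        return np.hstack([np.eye(n_classes)[p] for p in preds])

    def ca_fit(N, k=2):
        # Correspondence analysis of the indicator matrix N.
        P = N / N.sum()
        r = P.sum(axis=1)                                   # row masses
        c = P.sum(axis=0)                                   # column masses
        S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))  # standardized residuals
        U, s, Vt = np.linalg.svd(S, full_matrices=False)
        row_coords = (U[:, :k] * s[:k]) / np.sqrt(r)[:, None]  # principal row coordinates
        col_std = Vt[:k].T / np.sqrt(c)[:, None]                # standard column coordinates
        return row_coords, col_std

    def ca_project(N_new, col_std):
        # Project new rows as supplementary points via their column profiles.
        profiles = N_new / N_new.sum(axis=1, keepdims=True)
        return profiles @ col_std

    # Toy demonstration on the iris data set.
    X_tr, X_te, y_tr, y_te = train_test_split(*load_iris(return_X_y=True), random_state=0)
    learners = [DecisionTreeClassifier(random_state=0), GaussianNB()]
    for m in learners:
        m.fit(X_tr, y_tr)

    n_classes = len(np.unique(y_tr))
    # Resubstitution predictions keep the sketch short; cross-validated predictions
    # would be the more careful choice when building the stacked representation.
    N_tr = indicator([m.predict(X_tr) for m in learners], n_classes)
    row_coords, col_std = ca_fit(N_tr, k=2)

    # Classify test examples by the label of the nearest training example in the CA space.
    N_te = indicator([m.predict(X_te) for m in learners], n_classes)
    te_coords = ca_project(N_te, col_std)
    nn = KNeighborsClassifier(n_neighbors=1).fit(row_coords, y_tr)
    print("combined accuracy:", nn.score(te_coords, y_te))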
