Decimated input ensembles for improved generalization

Using an ensemble of classifiers instead of a single classifier has been shown to improve generalization performance on many difficult problems. However, this improvement only materializes if the classifiers in the ensemble are complementary. In this paper, we highlight the need to reduce the correlation among the component classifiers and investigate one method for correlation reduction: input decimation. Input decimation decouples the classifiers by training each one on a different subset of the input features, selected for its ability to discriminate among the classes. By presenting different parts of the feature set to each individual classifier, input decimation generates a diverse pool of classifiers. Experimental results confirm that combining input-decimated classifiers improves generalization performance.
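The idea can be illustrated with a minimal sketch: for each class, rank the input features by how well they discriminate that class, train one classifier per class on only its top-ranked features, and combine the ensemble's scores. Note this is an assumption-laden illustration, not the paper's exact procedure: the ranking criterion (absolute correlation with a one-vs-rest class indicator) and the base learner (a least-squares linear discriminant) are stand-ins chosen for brevity.

```python
import numpy as np

def select_features(X, y, cls, k):
    """Illustrative ranking: |correlation| of each feature with the
    one-vs-rest indicator for class `cls` (an assumed criterion)."""
    t = (y == cls).astype(float)
    corrs = np.array([abs(np.corrcoef(X[:, j], t)[0, 1])
                      for j in range(X.shape[1])])
    return np.argsort(corrs)[::-1][:k]  # indices of the k most correlated features

class InputDecimatedEnsemble:
    """One classifier per class, each seeing only its own feature subset."""
    def __init__(self, k):
        self.k = k  # number of features kept per component classifier

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.members_ = []
        for cls in self.classes_:
            feats = select_features(X, y, cls, self.k)
            # Least-squares one-vs-rest linear classifier on the subset
            # (a simple stand-in for the paper's neural-network components).
            Xi = np.hstack([X[:, feats], np.ones((len(X), 1))])
            t = np.where(y == cls, 1.0, -1.0)
            w, *_ = np.linalg.lstsq(Xi, t, rcond=None)
            self.members_.append((feats, w))
        return self

    def predict(self, X):
        # Combine the component scores; the class whose specialist
        # responds most strongly wins.
        scores = []
        for feats, w in self.members_:
            Xi = np.hstack([X[:, feats], np.ones((len(X), 1))])
            scores.append(Xi @ w)
        return self.classes_[np.argmax(np.stack(scores, axis=1), axis=1)]
```

Because each component sees a different slice of the input space, the components make less correlated errors than identically trained classifiers would, which is the property the combination exploits.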
