Classifier ensembles: Select real-world applications

Broad classes of statistical classification algorithms have been developed and applied successfully to a wide range of real-world domains. In general, matching the classification algorithm to the properties of the data is crucial to obtaining results that meet the needs of the application domain. One way to reduce the sensitivity to this algorithm/application match is to use ensembles of classifiers, in which the outputs of a variety of classifiers (either different types of classifiers or different instantiations of the same classifier) are pooled before a final classification decision is made. Intuitively, classifier ensembles allow the different needs of a difficult problem to be handled by classifiers suited to those particular needs. Mathematically, classifier ensembles provide an extra degree of freedom in the classical bias/variance tradeoff, allowing solutions that would be difficult (if not impossible) to reach with a single classifier. Because of these advantages, classifier ensembles have been applied to many difficult real-world problems. In this paper, we survey selected applications of ensemble methods to problems that have historically been most representative of the difficulties in classification. In particular, we survey applications of ensemble methods to remote sensing, person recognition, one vs. all recognition, and medicine.
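
As a rough illustration of what pooling classifier outputs before a final decision can look like, the sketch below combines three classifiers by simple majority vote. Majority voting is only one of many possible combiners and is chosen here for brevity; the function names (`majority_vote`, `ensemble_predict`) and the three threshold classifiers are hypothetical and are not taken from the paper.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most common label among one sample's predictions.

    Ties are broken in favor of the label that appears first.
    """
    return Counter(predictions).most_common(1)[0][0]


def ensemble_predict(classifiers, x):
    """Pool the outputs of all classifiers for input x, then vote."""
    return majority_vote([clf(x) for clf in classifiers])


# Three hypothetical classifiers: instantiations of the same threshold
# rule on a scalar input, each with a different threshold (assumption).
classifiers = [
    lambda x: int(x > 0.3),
    lambda x: int(x > 0.5),
    lambda x: int(x > 0.7),
]

labels = [ensemble_predict(classifiers, x) for x in (0.2, 0.4, 0.6, 0.8)]
print(labels)
```

In this toy setting the individual classifiers disagree on the inputs 0.4 and 0.6, and the vote settles each case in favor of the two classifiers that agree.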
