A novel ensemble algorithm for biomedical classification based on Ant Colony Optimization

Abstract: One of the major tasks in biomedicine is the classification and prediction of biomedical data. Ensemble learning is an effective way to significantly improve the generalization ability of classifiers and has therefore attracted growing attention in the biomedical community. However, most existing ensemble learning techniques combine all of the trained component classifiers, producing ensembles that are sometimes unnecessarily large and that incur extra memory cost and computation time. To improve both the generalization ability and the efficiency of ensembles for biomedical classification, this paper proposes an ensemble approach based on Ant Colony Optimization and rough set theory, which are used together to select a subset of the trained component classifiers for aggregation. Experimental results show that, compared with existing methods, the proposed approach not only reduces the size of the ensemble but also achieves higher prediction performance.
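The idea of selecting a subset of trained classifiers with an ant colony search can be illustrated with a minimal sketch. This is not the algorithm proposed in the paper: the rough set step is omitted entirely, and the pheromone rule, the size penalty, and all names (`vote_accuracy`, `aco_select`, `rho`, `size_penalty`) are assumptions made for illustration. Each classifier is represented only by its predictions on a validation set, and each ant samples a subset with inclusion probabilities given by the pheromone values.

```python
import random
from collections import Counter


def vote_accuracy(preds, subset, labels):
    """Accuracy of a plurality vote over the classifiers listed in `subset`."""
    correct = 0
    for j, y in enumerate(labels):
        votes = Counter(preds[i][j] for i in subset)
        if votes.most_common(1)[0][0] == y:
            correct += 1
    return correct / len(labels)


def aco_select(preds, labels, n_ants=10, n_iters=30, rho=0.2,
               size_penalty=0.01, seed=0):
    """Simplified ACO-style ensemble pruning (illustrative assumption only).

    preds[i][j] is the prediction of classifier i on validation sample j.
    Returns the best subset found and its penalized score.
    """
    rng = random.Random(seed)
    n = len(preds)
    tau = [0.5] * n  # pheromone, used directly as an inclusion probability
    best_subset, best_score = None, float("-inf")
    for _ in range(n_iters):
        for _ in range(n_ants):
            # Each ant builds a candidate ensemble by sampling classifiers.
            subset = [i for i in range(n) if rng.random() < tau[i]]
            if not subset:
                subset = [rng.randrange(n)]
            # Reward vote accuracy, penalize ensemble size (assumed trade-off).
            score = (vote_accuracy(preds, subset, labels)
                     - size_penalty * len(subset))
            if score > best_score:
                best_subset, best_score = subset, score
        # Evaporate pheromone and reinforce members of the best subset so far.
        chosen = set(best_subset)
        for i in range(n):
            target = 1.0 if i in chosen else 0.0
            tau[i] = (1 - rho) * tau[i] + rho * target
            tau[i] = min(0.95, max(0.05, tau[i]))  # keep exploration alive
    return best_subset, best_score
```

In this sketch, the returned subset would then be aggregated by the same plurality vote used in `vote_accuracy`; a real system would score candidates on held-out data and could use AUC or another measure instead of accuracy.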
