Biogeography-based rule mining for classification

Rule-based classification is a popular approach for solving real-world classification problems. Once suitable rules have been obtained, rule-based classifiers are easy to deploy and explain. In this paper, we describe an approach that uses biogeography-based optimization (BBO) to compute rule sets that maximize predictive accuracy. BBO is an evolutionary algorithm inspired by the migration patterns of species between the islands of an archipelago. In our implementation, each species corresponds to a classification rule, each island is occupied by multiple species and corresponds to a classifier, and the fitness of an island is the predictive classification accuracy of the corresponding classifier. The archipelago evolves via mutation, selection, and migration of species between islands. Successful islands have a decreased immigration rate and an increased emigration rate, so they tend to resist invasion and to colonize less successful islands. The result is an evolving set of habitats that corresponds to a population of classifiers. We demonstrate the effectiveness of our approach by comparing it to several state-of-the-art traditional and evolutionary classifiers.
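
The migration step described above can be illustrated with a minimal Python sketch. It is not the implementation from the paper. The abstract does not specify the rule encoding or the migration model, so the following are assumptions made only for illustration: a rule is a hypothetical `(attribute_index, value, label)` triple, a classifier predicts with its first matching rule, and rates follow a rank-based linear model in which the fittest island has immigration rate 0 and emigration rate 1. Mutation and selection are omitted.

```python
import random

# Assumption: a rule is (attribute_index, value, label); an island is a list of rules.

def predict(rules, x, default):
    """Return the label of the first rule that matches x, else the default label."""
    for attr, value, label in rules:
        if x[attr] == value:
            return label
    return default

def accuracy(rules, data, default=0):
    """Island fitness: fraction of (x, y) examples classified correctly."""
    hits = sum(predict(rules, x, default) == y for x, y in data)
    return hits / len(data)

def migration_rates(fitnesses):
    """Assumed rank-based linear model: returns (immigration, emigration) per island.

    Fitter islands get a lower immigration rate and a higher emigration rate.
    """
    n = len(fitnesses)
    order = sorted(range(n), key=lambda i: fitnesses[i])  # worst to best
    rates = [None] * n
    for rank, i in enumerate(order, start=1):
        mu = rank / n        # emigration rate
        lam = 1.0 - mu       # immigration rate
        rates[i] = (lam, mu)
    return rates

def migrate(islands, fitnesses, rng):
    """One migration pass: each rule slot may be overwritten by an immigrant rule."""
    rates = migration_rates(fitnesses)
    new_islands = [list(island) for island in islands]
    for i, (lam, _) in enumerate(rates):
        # Source islands are chosen in proportion to emigration rate, excluding island i.
        weights = [0.0 if j == i else mu for j, (_, mu) in enumerate(rates)]
        for slot in range(len(new_islands[i])):
            if rng.random() < lam:
                src = rng.choices(range(len(islands)), weights=weights, k=1)[0]
                new_islands[i][slot] = rng.choice(islands[src])
    return new_islands
```

In a full loop, `accuracy` would be evaluated for every island on training data, `migrate` would be called with those fitness values, and mutation and selection would follow. Under this assumed model the fittest island is never overwritten during migration, which mirrors the tendency of successful islands to resist invasion.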
