An evolutionary framework for machine learning applied to medical data

Abstract: Supervised learning problems can be addressed with a wide variety of machine learning approaches. In recent years there has been increasing interest in using evolutionary computation as a search method for classifiers, in support of the applied machine learning technique. In this context, knowledge representation in the form of logical rules has been one of the most widely accepted machine learning approaches because of its expressiveness. This paper proposes an evolutionary framework for rule-based classifier induction. Our proposal uses genetic programming to build a search method for classification rules (IF/THEN). With this approach, we address problems such as maximum rule length and rule intersection. The experiments were carried out on our domain of interest, medical data. The results define a methodology for evaluating learning methods for knowledge discovery from medical data. Moreover, comparison with other methods shows that our proposal can be very useful for the analysis and classification of data from the medical domain.
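
To make the notions of IF/THEN rules, maximum rule length, and rule intersection concrete, the following is a minimal Python sketch. It is not the framework proposed in the paper: the `Rule` class, the threshold-based condition format, and the helper names `truncate`, `fitness`, and `intersection` are all hypothetical, chosen only for illustration, and the evolutionary search itself (selection, crossover, mutation) is omitted.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

# One condition: (attribute index, operator, threshold). Hypothetical encoding.
Condition = Tuple[int, str, float]


@dataclass
class Rule:
    conditions: List[Condition]  # antecedent (IF part), read as a conjunction
    label: int                   # consequent (THEN part), the predicted class

    def covers(self, x: Sequence[float]) -> bool:
        """True if every condition in the antecedent holds for example x."""
        return all(
            x[i] <= t if op == "<=" else x[i] > t
            for i, op, t in self.conditions
        )


def truncate(rule: Rule, max_len: int) -> Rule:
    """Enforce a maximum rule length by keeping the first max_len conditions."""
    return Rule(rule.conditions[:max_len], rule.label)


def fitness(rule: Rule, X, y) -> float:
    """Fraction of covered examples whose class equals the rule's label.

    Returns 0.0 when the rule covers no example.
    """
    covered = [yi for xi, yi in zip(X, y) if rule.covers(xi)]
    if not covered:
        return 0.0
    return sum(1 for yi in covered if yi == rule.label) / len(covered)


def intersection(a: Rule, b: Rule, X) -> int:
    """Number of examples covered by both rules.

    A nonzero count for rules with different labels signals a conflict.
    """
    return sum(1 for xi in X if a.covers(xi) and b.covers(xi))
```

In an evolutionary loop, a function like `fitness` could score candidate rules, while `truncate` and `intersection` illustrate two simple ways of keeping rules short and detecting overlapping rules; the paper's actual mechanisms may differ.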
