A Survey of Genetic Algorithms for Multi-Label Classification

In recent years, multi-label classification (MLC) has emerged as an active research topic in big data analytics and machine learning. In this problem, each object in a dataset may be associated with multiple class labels, and the goal is to learn a classification model that can infer the correct labels of new, previously unseen objects. This paper presents a survey of genetic algorithms (GAs) designed for MLC tasks. The study is organized in three parts. First, we propose a new taxonomy focused on GAs for MLC. Second, we provide an up-to-date overview of the work in this area, categorizing the approaches identified in the literature with respect to the taxonomy. Third, we discuss some new ideas for combining GAs with MLC.
