Darwin or Lamarck? Future Challenges in Evolutionary Algorithms for Knowledge Discovery and Data Mining

Evolutionary Algorithms (EAs) are a fascinating branch of computational intelligence with much potential for use in many application areas. The fundamental principle of EAs is to use ideas inspired by biological mechanisms observed in nature, such as selection and genetic changes, to search for the best solution to a given optimization problem. Generally, EAs are iterative: they evolve a population of candidate solutions through guided random search, often with parallel processing, until a desired result is reached. Such population-based approaches, for example particle swarm and ant colony optimization (inspired by biology), are among the most popular metaheuristic methods used in machine learning, along with others such as simulated annealing (inspired by thermodynamics). In this paper, we provide a short survey of the state of the art in EAs, beginning with some background on the theory of evolution and contrasting the original ideas of Darwin and Lamarck. We then discuss the analogy between the biological and computational sciences and briefly describe some fundamentals of EAs, including Genetic Algorithms, Genetic Programming, Evolution Strategies, Swarm Intelligence Algorithms (namely Particle Swarm Optimization, Ant Colony Optimization, Bacteria Foraging Algorithms, the Bees Algorithm, and Invasive Weed Optimization), Memetic Search, Differential Evolution Search, Artificial Immune Systems, the Gravitational Search Algorithm, and the Intelligent Water Drops Algorithm. We conclude with a short description of the usefulness of EAs for Knowledge Discovery and Data Mining tasks and present some open problems and challenges to further stimulate research.
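
As a rough illustration of the iterative, population-based search described above, the following is a minimal sketch of a generic evolutionary loop and not an algorithm taken from the paper. The toy objective `sphere`, the function name `evolve`, and all parameter values (population size, mutation width `sigma`, number of generations) are assumptions chosen only for illustration.

```python
import random


def sphere(x):
    """Toy objective to minimise: sum of squares (optimum at the origin)."""
    return sum(v * v for v in x)


def evolve(fitness, dim=5, pop_size=20, generations=100, sigma=0.3, seed=0):
    """Minimal generational EA (illustrative assumption, not the paper's method).

    Uses binary tournament selection, uniform crossover, Gaussian mutation,
    and elitism (the best individual always survives).
    """
    rng = random.Random(seed)
    # Initial population: random points in [-5, 5]^dim.
    pop = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(pop_size)]
    initial_best = min(fitness(ind) for ind in pop)

    def pick(population):
        # Selection: the fitter of two randomly drawn individuals wins.
        a, b = rng.sample(population, 2)
        return a if fitness(a) <= fitness(b) else b

    for _ in range(generations):
        elite = min(pop, key=fitness)
        children = [elite[:]]  # elitism: carry the current best forward unchanged
        while len(children) < pop_size:
            p1, p2 = pick(pop), pick(pop)
            # Genetic changes: uniform crossover followed by Gaussian mutation.
            child = [rng.choice(pair) for pair in zip(p1, p2)]
            child = [v + rng.gauss(0, sigma) for v in child]
            children.append(child)
        pop = children

    best = min(pop, key=fitness)
    return best, fitness(best), initial_best


if __name__ == "__main__":
    best, best_f, init_f = evolve(sphere)
    print("initial best fitness:", init_f)
    print("final best fitness:  ", best_f)
```

Because the best individual is copied into every new generation, the best fitness in this sketch can never get worse from one generation to the next; the other design choices (tournament size, crossover type, fixed mutation width) are arbitrary and would normally be tuned per problem.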
