Scaling eCGA model building via data-intensive computing

This paper shows how the extended compact genetic algorithm can be scaled using data-intensive computing techniques such as MapReduce. Two different frameworks (Hadoop and MongoDB) are used to deploy MapReduce implementations of the compact and extended compact genetic algorithms. Results show that both are good choices to deal with large-scale problems as they can scale with the number of commodity machines, as opposed to previous efforts with other techniques that either required specialized high-performance hardware or shared memory environments.

[1]  Kalyanmoy Deb,et al.  Messy Genetic Algorithms: Motivation, Analysis, and First Results , 1989, Complex Syst..

[2]  Kalyanmoy Deb,et al.  Genetic Algorithms, Noise, and the Sizing of Populations , 1992, Complex Syst..

[3]  Rich Caruana,et al.  Removing the Genetics from the Standard Genetic Algorithm , 1995, ICML.

[4]  H. Mühlenbein,et al.  From Recombination of Genes to the Estimation of Distributions I. Binary Parameters , 1996, PPSN.

[5]  Heinz Mühlenbein,et al.  The Equation for Response to Selection and Its Use for Prediction , 1997, Evolutionary Computation.

[6]  David E. Goldberg,et al.  The compact genetic algorithm , 1999, IEEE Trans. Evol. Comput..

[7]  Fernando G. Lobo,et al.  Extended Compact Genetic Algorithm in C , 1999 .

[8]  G. Harik Linkage Learning via Probabilistic Modeling in the ECGA , 1999 .

[9]  Erick Cantú-Paz,et al.  Efficient and Accurate Parallel Genetic Algorithms , 2000, Genetic Algorithms and Evolutionary Computation.

[10]  David E. Goldberg,et al.  The Design of Innovation: Lessons from and for Competent Genetic Algorithms , 2002 .

[11]  David E. Goldberg,et al.  A Survey of Optimization by Building and Using Probabilistic Models , 2002, Comput. Optim. Appl..

[12]  Ian T. Foster,et al.  The virtual data grid: a new model and architecture for data-intensive collaboration , 2003, 15th International Conference on Scientific and Statistical Database Management, 2003..

[13]  David E. Goldberg,et al.  Efficiency Enhancement of Probabilistic Model Building Genetic Algorithms , 2004, ArXiv.

[14]  Muthu Dayalan,et al.  MapReduce : Simplified Data Processing on Large Cluster , 2018 .

[15]  David E. Goldberg,et al.  Designing Competent Mutation Operators Via Probabilistic Model Building of Neighborhoods , 2004, GECCO.

[16]  J. A. Lozano,et al.  Estimation of Distribution Algorithms , 2002, Genetic Algorithms and Evolutionary Computation.

[17]  Xavier Llorà,et al.  Fast rule matching for learning classifier systems via vector instructions , 2006, GECCO '06.

[18]  Xavier Llorà,et al.  Towards billion-bit optimization via a parallel estimation of distribution algorithm , 2007, GECCO '07.

[19]  Zhong-Xian Chi,et al.  An Efficient Fine-grained Parallel Genetic Algorithm Based on GPU-Accelerated , 2007, 2007 IFIP International Conference on Network and Parallel Computing Workshops (NPC 2007).

[20]  Rajkumar Buyya,et al.  MRPGA: An Extension of MapReduce for Parallelizing Genetic Algorithms , 2008, 2008 IEEE Fourth International Conference on eScience.

[21]  Xavier Llorà,et al.  Meandre: Semantic-Driven Data-Intensive Flows in the Clouds , 2008, 2008 IEEE Fourth International Conference on eScience.

[22]  Griffin Caprio,et al.  Parallel Metaheuristics , 2008, IEEE Distributed Systems Online.

[23]  David E. Goldberg,et al.  Enhancing the Efficiency of the ECGA , 2008, PPSN.

[24]  Xavier Llorà Data-intensive computing for competent genetic algorithms: a pilot study using meandre , 2009, GECCO '09.

[25]  Xavier Llorà,et al.  Scaling Genetic Algorithms Using MapReduce , 2009, 2009 Ninth International Conference on Intelligent Systems Design and Applications.

[26]  Man Leung Wong,et al.  Parallel multi-objective evolutionary algorithms on graphics processing units , 2009, GECCO '09.

[27]  Asim Munawar,et al.  Hybrid of genetic algorithm and local search to solve MAX-SAT problem using nVidia CUDA framework , 2009, Genetic Programming and Evolvable Machines.

[28]  Xavier Llorà,et al.  When Huge Is Routine: Scaling Genetic Algorithms and Estimation of Distribution Algorithms via Data-Intensive Computing , 2010, Parallel and Distributed Computational Intelligence.