ECJ+HADOOP: An Easy Way to Deploy Massive Runs of Evolutionary Algorithms

This paper describes initial steps towards allowing Evolutionary Algorithm (EA) researchers to easily deploy compute-intensive EA runs on Big Data infrastructures. Many proposals have already been described in the literature, and a number of new software tools embodying parallel versions of EAs have been implemented, but we take a different approach. Given the traditional resistance to change when adopting new software, we instead endow the well-known ECJ tool with the MapReduce model. Using the Hadoop framework, we introduce changes in ECJ that allow researchers to launch any EA problem on a Big Data infrastructure in much the same way as on a single computer. A new parameter lets researchers choose where the run is launched: on a Hadoop-based infrastructure or on a desktop computer. This paper presents the tests performed, explains how the whole system was tuned to optimize the running time of ECJ experiments, and finally uses a real-world problem to show how the MapReduce model can automatically deploy the tasks generated by ECJ without additional intervention.
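
As a rough illustration of the idea, and not ECJ's or Hadoop's actual API, the sketch below shows how a single switch could route the same evaluation work either through a plain local loop or through a map step followed by a reduce step. It assumes the distributed tasks are fitness evaluations, which the text above does not state. The `backend` argument, the function names, and the OneMax fitness are all hypothetical, and the "mapreduce" path runs in-process purely to show the data flow.

```python
def evaluate(individual):
    # Hypothetical fitness (OneMax: number of ones); stands in for any problem.
    return sum(individual)


def map_phase(population):
    # Map: evaluate each individual independently, emitting (index, fitness).
    return [(i, evaluate(ind)) for i, ind in enumerate(population)]


def reduce_phase(pairs):
    # Reduce: collect the fitness values back into population order.
    return [fitness for _, fitness in sorted(pairs)]


def evaluate_population(population, backend="local"):
    # `backend` is an assumed stand-in for the run-location parameter;
    # the real parameter name and values are not given in the text.
    if backend == "local":
        return [evaluate(ind) for ind in population]
    if backend == "mapreduce":
        return reduce_phase(map_phase(population))
    raise ValueError(f"unknown backend: {backend}")


if __name__ == "__main__":
    pop = [[1, 0, 1], [1, 1, 1], [0, 0, 0]]
    print(evaluate_population(pop, backend="local"))
    print(evaluate_population(pop, backend="mapreduce"))
```

Both calls return the same fitness list. The point of the sketch is that the caller changes only one argument, while the evaluation code itself stays untouched.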
