Evolutionary Scientific Workflows

Evolutionary computing has become an alternative to classical computational models because it can often find good solutions to hard scientific problems efficiently. Workflows automate scientific processes and provide mechanisms that underpin modern computational science. However, traditional workflow systems offer little support for evolutionary computing. To bridge this gap, this paper describes VisPyGMO, which extends existing scientific workflow management systems with a range of reusable evolutionary algorithm modules. We also present a use case that applies evolutionary algorithms in the VisTrails system to analyze more than 20 years of historical data retrieved from the Kaggle platform and to predict pesticide consumption in fresh foods. The results demonstrate the feasibility of the proposal and show that an evolutionary approach can significantly optimize state-of-the-art scientific workflows.
