FireWorks: a dynamic workflow system designed for high‐throughput applications

This paper introduces FireWorks, a workflow software package for running high‐throughput calculation workflows at supercomputing centers. FireWorks has been used to complete more than 50 million CPU‐hours of computational chemistry and materials science calculations at the National Energy Research Scientific Computing Center (NERSC). It is designed to serve the demanding high‐throughput computing needs of these applications, with extensive support for (i) concurrent execution through job packing, (ii) failure detection and correction, (iii) provenance and reporting for long‐running projects, (iv) automated duplicate detection, and (v) dynamic workflows (i.e., modifying the workflow graph during runtime). We have found these features to be highly relevant to modern data‐driven and high‐throughput science applications, and we discuss our implementation strategy, which rests on Python and a NoSQL database (MongoDB). Finally, we present performance data and the limitations of our approach, along with planned future work. Copyright © 2015 John Wiley & Sons, Ltd.
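To make the notion of a dynamic workflow concrete, the following is a minimal sketch in plain Python of a workflow graph that grows while it runs. It does not use the FireWorks API and is not the implementation described in the paper; `run_workflow`, `screen`, and `make_refine` are hypothetical names introduced only for illustration, and the scores are made-up placeholder values.

```python
def run_workflow(tasks, parents):
    """Run tasks in dependency order, allowing the graph to grow at runtime.

    tasks:   name -> callable(results) returning (output, additions)
    parents: name -> set of task names that must finish first
    additions is a list of (new_name, new_callable) pairs that are attached
    as children of the task that just ran.
    (Hypothetical helper, not part of FireWorks.)
    """
    tasks = dict(tasks)
    parents = {name: set(parents.get(name, ())) for name in tasks}
    results = {}
    order = []
    while len(results) < len(tasks):
        # A task is ready once all of its parents have produced a result.
        ready = sorted(
            name for name in tasks
            if name not in results and parents[name].issubset(results)
        )
        if not ready:
            raise ValueError("cycle or missing dependency in workflow")
        name = ready[0]
        output, additions = tasks[name](results)
        results[name] = output
        order.append(name)
        # Dynamic step: the finished task may add new children to the graph.
        for new_name, new_fn in additions:
            tasks[new_name] = new_fn
            parents[new_name] = {name}
    return order, results


def make_refine(label):
    """Build a follow-up task for one candidate (illustrative only)."""
    def refine(results):
        return f"refined {label}", []
    return refine


def screen(results):
    """Decide at runtime which follow-up tasks are worth adding."""
    scores = {"a": 0.9, "b": 0.2, "c": 0.7}  # placeholder values
    passed = [key for key, value in scores.items() if value > 0.5]
    additions = [(f"refine_{key}", make_refine(key)) for key in passed]
    return passed, additions


# The workflow starts with a single task; the rest is created during the run.
order, results = run_workflow({"screen": screen}, {})
print(order)
```

In this sketch the initial graph contains only `screen`; the two `refine_*` tasks exist only because `screen` decided, from its own output, to create them. A production system would additionally persist the graph and task states (the paper uses MongoDB for this), which the sketch omits.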
