Parameter Exploration in Science and Engineering Using Many-Task Computing

Robust scientific methods require exploring the parameter space of a system, which may involve complete state space exploration, experimental design, or numerical optimization techniques. Many of the resulting runs can be executed in parallel on distributed resources. Many-Task Computing (MTC) provides a framework for performing robust design because it supports the execution of a large number of otherwise independent processes. Further, scientific workflow engines facilitate the specification and execution of complex software pipelines, such as those found in real science and engineering design problems. However, most existing workflow engines support neither a wide range of experimentation techniques nor a large number of independent tasks. In this paper, we discuss Nimrod/K, a set of add-in components and a new runtime machine for a general workflow engine, Kepler. Nimrod/K provides an execution architecture based on tagged dataflow concepts, which were developed in the 1980s for highly parallel machines. This architecture is embodied in a new Kepler "Director" that supports many-task computing by orchestrating the execution of tasks on clusters, Grids, and Clouds. Further, Nimrod/K provides a set of "Actors" that facilitate the various modes of parameter exploration discussed above. We demonstrate the power of Nimrod/K by solving real problems in cardiac science.
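As a rough illustration of the many-task pattern described above, the following Python sketch runs a full-factorial parameter sweep as a set of independent tasks. It does not use Nimrod/K or Kepler: the `model` function, the parameter names `conductance` and `stimulus`, and the `sweep` helper are all hypothetical, and a local thread pool stands in for distributed resources. Each task carries an integer tag so that results can be matched to their inputs regardless of completion order, which is only loosely analogous to the tagging used in tagged dataflow.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def model(conductance, stimulus):
    """Hypothetical stand-in for one simulation run (not a real cardiac model)."""
    return conductance * stimulus ** 2


def sweep(param_grid, workers=4):
    """Evaluate the model at every point of a full-factorial parameter grid.

    Every point becomes an independent task. Each task is tagged with its
    index in the sweep so results can be paired with the inputs that
    produced them.
    """
    names = sorted(param_grid)
    points = list(product(*(param_grid[n] for n in names)))
    tagged = list(enumerate(points))

    def run(item):
        tag, values = item
        return tag, model(**dict(zip(names, values)))

    # A local thread pool is assumed here in place of clusters, Grids, or Clouds.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = dict(pool.map(run, tagged))

    return [(dict(zip(names, p)), results[tag]) for tag, p in tagged]


grid = {"conductance": [0.5, 1.0], "stimulus": [1.0, 2.0, 3.0]}
results = sweep(grid)
best_params, best_value = max(results, key=lambda r: r[1])
```

Here the sweep enumerates all six combinations of the two parameters and picks the point with the largest model output. An experimental design or optimization method would replace the exhaustive `product` enumeration with a smaller or adaptively chosen set of points.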
