Supporting dynamic parameter sweep in adaptive and user-steered workflow

Large-scale experiments in computational science are complex to manage. Because these experiments are exploratory, they run several iterations to evaluate a large space of parameter combinations. Scientists analyze partial results and dynamically interfere with the next steps of the simulation. Scientific workflow management systems can execute such experiments by providing process management, distributed execution and provenance data. However, supporting scientists in complex exploratory processes that involve dynamic workflows is still a challenge. Features such as user steering, which lets scientists track, evaluate and adapt the execution, need to be designed to support iterative methods. We provide an approach to support dynamic parameter sweep, in which scientists can use the results obtained in a slice of the parameter space to improve the remainder of the execution. We propose new control structures that enable adaptive and user-steered workflows and support iterative methods through dynamic mechanisms. We evaluate our approach with a proof-of-concept workflow based on the Lanczos algorithm, and the results show execution time savings of up to 78%.
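The idea of a dynamic parameter sweep can be illustrated with a minimal sketch. It is not the control structures proposed in the paper: the names `dynamic_sweep`, `evaluate`, `steer` and `toy_error`, the slice size and the tolerance are all hypothetical, and the error model is a toy stand-in for an iterative method such as Lanczos. The sweep evaluates one slice of the parameter space at a time, and a steering function uses that slice's results to prune the parameters not yet evaluated.

```python
from typing import Callable, List, Sequence, Tuple

Result = Tuple[int, float]  # (parameter value, measured error)


def dynamic_sweep(
    params: Sequence[int],
    evaluate: Callable[[int], float],
    steer: Callable[[List[Result], List[int]], List[int]],
    slice_size: int,
) -> List[Result]:
    """Evaluate params slice by slice; after each slice, `steer`
    decides which of the remaining parameters are still worth running."""
    remaining = list(params)
    results: List[Result] = []
    while remaining:
        current, remaining = remaining[:slice_size], remaining[slice_size:]
        slice_results = [(p, evaluate(p)) for p in current]
        results.extend(slice_results)
        # Hypothetical steering hook: may prune or keep the remainder.
        remaining = steer(slice_results, remaining)
    return results


def toy_error(m: int) -> float:
    """Toy assumption: the error shrinks as the parameter m grows."""
    return 1.0 / m


def stop_when_converged(slice_results: List[Result], remaining: List[int]) -> List[int]:
    """Drop the rest of the sweep once any result meets the tolerance."""
    tolerance = 0.15  # hypothetical acceptance threshold
    if any(err <= tolerance for _, err in slice_results):
        return []
    return remaining


results = dynamic_sweep(range(1, 21), toy_error, stop_when_converged, slice_size=4)
evaluated = [p for p, _ in results]
print(f"evaluated {len(evaluated)} of 20 parameter values: {evaluated}")
```

In this toy run the second slice already meets the tolerance, so the remaining twelve parameter values are never evaluated. In a user-steered setting, the `steer` function would be replaced by a decision the scientist makes after inspecting partial results.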
