Towards autonomic data management for staging-based coupled scientific workflows

Abstract Emerging scientific workflows running at extreme scale are composed of multiple applications that interact and exchange data at runtime. While staging-based approaches, e.g., in-situ/in-transit processing, are promising, dynamic behaviors in coupled applications (e.g., data volumes and distributions) and varying resource constraints at runtime make it challenging to use these techniques efficiently. Addressing these challenges requires fundamental changes in the way workflows are executed. Specifically, this requires monitoring the operating environment and the running applications, and then adapting and tuning application behaviors and resource allocations at runtime while meeting data management requirements and constraints. In this paper, we propose a policy-based autonomic data management (ADM) approach that adaptively responds at runtime to dynamic data management requirements. We first formulate the schematic abstraction of this ADM approach, including its conceptual model and system elements. We then explore the realization of the ADM runtime and demonstrate how to achieve cross-layer adaptations with pre-defined autonomic policies. We also prototype our ADM approach and evaluate its performance on the Intrepid IBM-BlueGene and Titan Cray-XK7 systems using Chombo-based AMR applications and a visualization application. The experimental results demonstrate its effectiveness in meeting user-defined objectives and accelerating overall scientific discovery.
