Toward Managing HPC Burst Buffers Effectively: Draining Strategy to Regulate Bursty I/O Behavior

HPC (high-performance computing) applications typically exhibit bursty I/O behavior. To expedite these applications, permanent storage systems are usually provisioned to serve such I/O bursts. As exascale computing approaches, non-volatile RAM is being introduced as burst buffers to absorb the bursty bulk data and relax the I/O provisioning requirement of the permanent storage systems. However, unless the burst buffers are drained judiciously, the I/O bursts are simply passed down to the underlying storage systems, causing severe I/O contention. To minimize the I/O provisioning requirement and resolve the issues caused by I/O bursts, we propose a proactive draining scheme that manages the draining process of distributed node-local burst buffers. In addition, we develop an I/O provisioning model to predict the minimum I/O provisioning requirement for permanent storage systems. Evaluation results show that the proactive draining scheme substantially relaxes the I/O provisioning requirement while preserving the I/O performance of the underlying storage systems.
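
To make the provisioning argument concrete, the following is a minimal sketch of why a regulated drain lowers the bandwidth the permanent storage must provide. It is not the paper's proactive draining scheme or its I/O provisioning model; it assumes a single buffer drained at one constant rate, and the function names, the per-step write trace, and the capacity figure are all hypothetical.

```python
def peak_occupancy(writes, drain_rate):
    """Simulate a buffer drained at a fixed rate; return its peak occupancy.

    Assumption: in each time step the application writes `w` units into the
    buffer and up to `drain_rate` units are drained to permanent storage.
    """
    occupancy = 0.0
    peak = 0.0
    for w in writes:
        occupancy = max(0.0, occupancy + w - drain_rate)
        peak = max(peak, occupancy)
    return peak


def min_drain_rate(writes, capacity):
    """Smallest constant drain rate that keeps occupancy within `capacity`.

    For every contiguous interval, the data written minus what the buffer
    can hold must be drained within that interval.
    """
    best = 0.0
    n = len(writes)
    for start in range(n):
        total = 0.0
        for end in range(start, n):
            total += writes[end]
            length = end - start + 1
            best = max(best, (total - capacity) / length)
    return best


# Hypothetical trace: two 40-unit bursts separated by idle (compute) phases.
writes = [0, 0, 0, 40, 0, 0, 0, 40, 0, 0]
capacity = 30  # hypothetical burst buffer capacity, same units as writes

unbuffered_peak = max(writes)                 # bandwidth needed with no buffer
buffered_rate = min_drain_rate(writes, capacity)
print(unbuffered_peak, buffered_rate)
```

In this toy trace the storage system would need to absorb 40 units per step without a buffer, while a steady drain of 10 units per step suffices with the assumed buffer. Draining the buffer as fast as possible instead would recreate the original burst at the storage system, which is the contention problem the abstract describes.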
