StoreRush: An Application-Level Approach to Harvesting Idle Storage in a Best Effort Environment

Abstract: In a production HPC system where storage devices are shared among multiple applications and managed in a best-effort manner, contention is a major problem: some storage devices become more heavily loaded than others, which significantly reduces I/O throughput. In this paper, we describe our latest effort, StoreRush, which addresses this practical issue at the application level without requiring modification to the file or storage system. The proposed scheme uses a two-level messaging system to harvest idle storage by re-routing I/O requests to a less congested storage location. This improves write performance, while the impact on reads is limited by throttling re-routing when it is deemed excessive. An analytical model is derived to guide the choice of the optimal throttling factor. The scheme is evaluated with the production applications Pixie3D, XGC1, and QMCPack during production windows; the results demonstrate its effectiveness (e.g., up to 1.8x improvement in write performance) and its scalability (up to 131,072 cores).
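To make the re-routing and throttling idea concrete, the following is a minimal illustrative sketch, not the StoreRush implementation. It assumes a static per-target load estimate and interprets the throttling factor as a cap on the fraction of requests that may be re-routed; the names `pick_target`, `route_all`, `load`, and `max_fraction` are hypothetical, and the two-level messaging system and the analytical model are not modeled here.

```python
def pick_target(home, load, rerouted, total, max_fraction):
    """Choose a storage target for one write request (illustrative only).

    home         -- index of the target the request would use by default
    load         -- per-target load estimates (hypothetical metric)
    rerouted     -- number of requests re-routed so far
    total        -- number of requests handled so far
    max_fraction -- assumed throttling factor: cap on the re-routed share
    """
    # Throttle: stay on the home target if one more re-route would push
    # the re-routed share above the cap.
    if rerouted + 1 > max_fraction * (total + 1):
        return home
    # Otherwise re-route only when some target is strictly less loaded.
    least = min(range(len(load)), key=load.__getitem__)
    return least if load[least] < load[home] else home


def route_all(homes, load, max_fraction):
    """Route a sequence of requests, tracking how many were re-routed."""
    rerouted = 0
    placements = []
    for total, home in enumerate(homes):
        target = pick_target(home, load, rerouted, total, max_fraction)
        if target != home:
            rerouted += 1
        placements.append(target)
    return placements


if __name__ == "__main__":
    # Four requests whose home target (0) is far busier than target 1.
    print(route_all([0, 0, 0, 0], load=[10, 1], max_fraction=0.5))
```

In this sketch, a `max_fraction` of 0 disables re-routing entirely and a value of 1 lets every request move to the least-loaded target; intermediate values trade write-side load balancing against how widely data is scattered for later reads.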
