To share or not to share: comparing burst buffer architectures

Modern high-performance computing platforms employ burst buffers to overcome the I/O bottleneck that limits the scale and efficiency of large-scale parallel computations. Currently there are two competing burst buffer architectures: one treats burst buffers as a dedicated shared resource, while the other integrates burst buffer hardware into each compute node. In this paper we examine, through modeling, the design tradeoffs between local burst buffer architectures and shared, dedicated ones. By seeding our simulation with realistic workloads, we systematically evaluate the performance of both designs. Our studies validate previous results indicating that storage systems without parity protection can reduce overall time to solution, and further determine that shared burst buffer organizations can yield 3.5× greater average application I/O throughput than local burst buffer configurations.
