Development of a Burst Buffer System for Data-Intensive Applications

Modern parallel filesystems such as Lustre are designed to provide high, scalable I/O bandwidth to meet growing I/O requirements. However, the bursty I/O behavior of many data-intensive scientific applications makes it difficult for back-end parallel filesystems to handle I/O requests efficiently. A burst buffer system temporarily buffers data on high-performance storage media and then flushes it gradually to the back-end filesystem. In this paper, we explore issues surrounding the development of a burst buffer system for data-intensive scientific applications. Our initial results indicate that a burst buffer system layered on top of the Lustre filesystem shows promise for handling the intense I/O traffic generated by application checkpointing.
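The buffer-then-flush idea can be illustrated with a toy model. The sketch below is not the system described in this paper: the class `BurstBuffer`, its parameters, and the helper `simulate_checkpoint` are hypothetical names chosen for illustration, and the back-end filesystem is stood in for by a plain Python list. It only shows how a checkpoint burst might be absorbed at once and then drained at a bounded rate.

```python
from collections import deque


class BurstBuffer:
    """Toy burst buffer (hypothetical): absorbs writes at once, drains at a bounded rate."""

    def __init__(self, capacity_bytes, drain_bytes_per_tick):
        self.capacity_bytes = capacity_bytes
        self.drain_bytes_per_tick = drain_bytes_per_tick
        self.pending = deque()  # chunks staged on the fast storage tier
        self.used_bytes = 0
        self.backend = []       # stand-in for the back-end parallel filesystem

    def write(self, chunk):
        """Stage a chunk on the fast tier; return False if it does not fit."""
        if self.used_bytes + len(chunk) > self.capacity_bytes:
            return False
        self.pending.append(chunk)
        self.used_bytes += len(chunk)
        return True

    def drain_tick(self):
        """Flush whole chunks in FIFO order within one tick's byte budget.

        At least one chunk is flushed per tick so that draining always
        makes progress, even if a chunk exceeds the budget.
        """
        budget = self.drain_bytes_per_tick
        flushed = 0
        while self.pending and (flushed == 0 or len(self.pending[0]) <= budget):
            chunk = self.pending.popleft()
            self.backend.append(chunk)
            budget -= len(chunk)
            self.used_bytes -= len(chunk)
            flushed += len(chunk)
        return flushed


def simulate_checkpoint(num_chunks=8, chunk_size=4, drain_rate=8):
    """Write a burst of chunks, then count the ticks needed to drain it."""
    bb = BurstBuffer(capacity_bytes=num_chunks * chunk_size,
                     drain_bytes_per_tick=drain_rate)
    for i in range(num_chunks):
        if not bb.write(bytes([i]) * chunk_size):
            raise RuntimeError("burst exceeded buffer capacity")
    ticks = 0
    while bb.pending:
        bb.drain_tick()
        ticks += 1
    return ticks, bb.backend
```

In this model the application sees only the cost of `write`, while the back end receives the same data spread over several ticks; a real system would also have to deal with capacity exhaustion, failures, and data placement, which the sketch ignores.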
