Hybrid hierarchy storage system in MilkyWay-2 supercomputer

With the rapid improvement of computational capability in high-performance supercomputer systems, the performance imbalance between the computation subsystem and the storage subsystem has become increasingly serious, especially as applications produce data sets ranging from tens of gigabytes up to terabytes. To narrow this gap, large-scale storage systems must be designed and implemented with high performance and scalability. The MilkyWay-2 (TH-2) supercomputer, with a peak performance of 54.9 Pflops, clearly imposes such requirements on its storage system. This paper introduces the storage system of the MilkyWay-2 supercomputer, covering both the hardware architecture and the parallel file system. The storage system exploits a novel hybrid hierarchy storage architecture to enable high scalability in the number of I/O clients, I/O bandwidth, and storage capacity. To fit this architecture, a user-level virtualized file system named H2FS is designed and implemented; it combines local storage and shared storage into a dynamic single namespace to optimize I/O performance for I/O-intensive applications. Evaluation results show that the storage system of the MilkyWay-2 supercomputer satisfies the critical requirements of large-scale supercomputers, such as performance and scalability.
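The abstract does not spell out H2FS's interfaces, but the core idea of a dynamic single namespace over a local/shared storage hierarchy can be sketched. The following minimal Python model is purely illustrative: the class name, paths, 64 MiB cutoff, and migration helper are assumptions for exposition, not H2FS's actual API.

```python
import os
import shutil

class HybridNamespace:
    """Toy model of a hybrid single namespace: data that fits on fast
    node-local storage is staged there; larger or persistent data goes
    to the shared parallel file system. Every name here (the paths,
    the size cutoff, the migration helper) is hypothetical."""

    def __init__(self, local_root="/local/ssd", shared_root="/shared/pfs",
                 local_limit=64 * 2**20):
        self.local_root = local_root
        self.shared_root = shared_root
        self.local_limit = local_limit

    def resolve(self, name):
        # One logical name maps to one physical path: prefer the local
        # copy when it exists, otherwise fall back to shared storage.
        local = os.path.join(self.local_root, name)
        if os.path.exists(local):
            return local
        return os.path.join(self.shared_root, name)

    def open_for_write(self, name, expected_size):
        # Route new data by expected size: small bursts land on local
        # storage, large files go straight to the shared file system.
        root = self.local_root if expected_size <= self.local_limit else self.shared_root
        path = os.path.join(root, name)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        return open(path, "wb")

    def flush_to_shared(self, name):
        # Drain a locally staged file into the shared namespace, e.g.
        # after a checkpoint burst has been absorbed on the node.
        local = os.path.join(self.local_root, name)
        shared = os.path.join(self.shared_root, name)
        os.makedirs(os.path.dirname(shared), exist_ok=True)
        shutil.move(local, shared)
```

In the real system this placement and migration logic lives inside the file system rather than the application, but the sketch illustrates why a dynamic single namespace helps I/O-intensive workloads: write bursts are absorbed on node-local storage and later drained to shared storage without the application changing the paths it uses.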
