PaScal - a new parallel and scalable server IO networking infrastructure for supporting global storage/file systems in large-size Linux clusters

This paper presents the design and implementation of PaScal (Parallel and Scalable I/O networking framework), a new I/O networking infrastructure that supports high-bandwidth, IP-based global storage systems for large-scale Linux clusters. PaScal has several distinguishing properties: (1) a multi-level switch-fabric interconnection network that combines high-speed interconnects for the inter-process communication (IPC) requirements of computation with low-cost Gigabit Ethernet for IP-based global storage/file access; (2) a bandwidth-on-demand, scalable I/O networking architecture; (3) open-standard IP networks (routing and switching); (4) multipath routing for load balancing and failover; (5) Open Shortest Path First (OSPF) routing software; and (6) support for a global file system in multi-cluster and multi-platform environments. We describe both the hardware and software components of PaScal. We have implemented the PaScal I/O infrastructure on several large Linux clusters at LANL and have run a series of parallel MPI-IO assessment benchmarks on LANL's 1024-node Pink Linux cluster with the Panasas global parallel file system. The benchmark results demonstrate that the PaScal I/O infrastructure is robust and that its bandwidth scales on large Linux clusters.
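As an illustration of property (4), the sketch below shows one common way multipath routing can spread flows over several equal-cost next hops and steer around a failed one. It is not taken from PaScal: the function `pick_next_hop`, the `io-node-*` names, the flow tuple layout, and the port number are all hypothetical, and real OSPF equal-cost multipath forwarding in a router or kernel uses its own hashing scheme.

```python
import hashlib


def pick_next_hop(flow, next_hops, down=frozenset()):
    """Pick one live next hop for a flow by hashing the flow identifier.

    Hashing keeps every packet of a flow on the same path (no reordering)
    while different flows spread across the available paths.
    """
    live = [hop for hop in next_hops if hop not in down]
    if not live:
        raise RuntimeError("no live next hop available")
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(live)
    return live[index]


# Hypothetical I/O routing nodes sitting between compute nodes and storage.
gateways = ["io-node-0", "io-node-1", "io-node-2", "io-node-3"]

# Hypothetical flows: (compute node, storage endpoint, port).
flows = [(f"compute-{i}", "storage-0", 5000) for i in range(1024)]

# Load balancing: count how many flows land on each gateway.
counts = {gateway: 0 for gateway in gateways}
for flow in flows:
    counts[pick_next_hop(flow, gateways)] += 1

# Failover: with io-node-2 marked down, no flow is routed through it.
after_failure = {pick_next_hop(flow, gateways, down={"io-node-2"}) for flow in flows}

print(counts)
print(sorted(after_failure))
```

In this simple modulo scheme a failure remaps some flows that were not using the failed hop; a deployment that wants to limit such remapping might prefer consistent hashing instead.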
