Interconnection Architectures for Petabyte-Scale High-Performance Storage Systems

As demand for storage bandwidth and capacity grows, designers have proposed the construction of petabyte-scale storage systems. Rather than relying upon a few very large storage arrays, these petabyte-scale systems have thousands of individual disks working together to provide aggregate storage system bandwidth exceeding 100 GB/s. However, providing this bandwidth to storage system clients becomes difficult due to limits in network technology. This paper discusses different interconnection topologies for large disk-based systems, drawing on previous experience from the parallel computing community. By choosing the right network, storage system designers can eliminate the need for expensive high-bandwidth communication links and provide a highly redundant network resilient against single node failures. We analyze several different topology choices and explore the tradeoffs between cost and performance. Using simulations, we uncover potential pitfalls, such as the placement of connections between the storage system network and its clients, that may arise when designing such a large system.
