The Lustre File System and 100 Gigabit Wide Area Networking: An Example Case from SC11

As part of the SCinet Research Sandbox at the IEEE/ACM International Conference for High Performance Computing, Networking, Storage and Analysis (SC11), Indiana University used a dedicated 100 Gbps wide area network (WAN) link spanning more than 3,500 km (2,175 mi) to demonstrate the capabilities of the Lustre high-performance parallel file system in a high-bandwidth, high-latency WAN environment. The demonstration served as a proof of concept and provided an opportunity to study Lustre's performance over a 100 Gbps WAN. To characterize the performance of the network and file system, we ran a series of benchmarks and tests: low-level iperf network tests, Lustre networking tests, file system tests with the IOR benchmark, and a suite of real-world applications reading from and writing to the file system. All of the tests and benchmarks were run over the WAN link, which had a latency of 50.5 ms. In this article we describe the configuration and constraints of the demonstration and focus on the key findings regarding the networking layer for this extremely high-bandwidth, high-latency connection. Of particular interest are the challenges presented by link aggregation across a relatively small number of high-bandwidth connections, and the specifics of virtual local area network (VLAN) routing for 100 Gbps routing elements.
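To give a sense of scale for the figures quoted above, the following minimal sketch computes the bandwidth-delay product of the link, that is, the amount of data that must be in flight to keep the pipe full. It assumes the quoted 50.5 ms is a round-trip time; the text says only "latency", so if the figure is one-way the result doubles. The function names and the 16 MiB per-connection window are hypothetical values chosen for illustration and are not taken from the demonstration.

```python
def bandwidth_delay_product_bytes(bandwidth_bps: int, rtt_us: int) -> int:
    """Bytes that must be in flight to fill a link of the given speed and RTT."""
    # Integer arithmetic avoids floating-point rounding in the result.
    bits_in_flight = bandwidth_bps * rtt_us // 1_000_000
    return bits_in_flight // 8


def min_connections(bdp_bytes: int, window_bytes: int) -> int:
    """Smallest number of connections whose combined windows cover the BDP."""
    return -(-bdp_bytes // window_bytes)  # ceiling division


LINK_BPS = 100_000_000_000   # 100 Gbps, from the text
RTT_US = 50_500              # 50.5 ms, ASSUMED to be round-trip time
WINDOW_BYTES = 16 * 1024**2  # hypothetical 16 MiB window per connection

bdp = bandwidth_delay_product_bytes(LINK_BPS, RTT_US)
print(f"Bandwidth-delay product: {bdp / 1e6:.2f} MB")
print(f"Connections needed at 16 MiB each: {min_connections(bdp, WINDOW_BYTES)}")
```

Under these assumptions roughly 631 MB must be outstanding at any moment to saturate the link, which illustrates why buffer sizing and the number of concurrent connections matter at this combination of bandwidth and latency.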
