A Comprehensive Study of Wide Area Data Movement at a Scientific Computing Facility

Wide-area data transfer is central to distributed science. Network capacity, data movement infrastructure, and tools in science environments continuously evolve to meet the requirements of distributed-science applications. Research and education (R&E) networks such as the U.S. Department of Energy's Energy Sciences Network (ESnet) and Internet2 provide multiple 100 Gbps backbone networks. Large scientific facilities and research institutions have 100 Gbps wide-area network connectivity, and 10 Gbps wide-area connectivity is common at many R&E institutions. Many of these institutions employ Science DMZs, dedicated data transfer nodes, and high-performance data movement tools to improve wide-area data transfer performance. Large facilities may use 10 or more dedicated data transfer nodes to meet the needs of their users. In this work, we analyze various logs pertaining to wide-area data transfers into and out of a large scientific facility to gain insight into data transfer characteristics and behavior. We also show some of the inefficiencies in a state-of-the-art data movement tool and discuss approaches to address them.
