Differentiated network services for data-intensive science using application-aware SDN

Data-intensive science projects rely on scalable, high-performance, fault-tolerant protocols for transferring large volumes of data over high-bandwidth, high-delay wide area networks (WANs). GridFTP is the protocol commonly used for WAN data distribution. Because GridFTP uses encrypted sessions for data transfers and does not exchange any information with the network layer, it limits the flexibility of network management at the site level. We propose an application-aware software-defined networking (SDN) approach for providing differentiated network services to high-energy physics projects such as the Compact Muon Solenoid (CMS) and the Laser Interferometer Gravitational-Wave Observatory (LIGO). We demonstrate a policy-driven approach that differentiates network traffic by exploiting collaboration between the application and network layers, which allows accurate accounting of the resources used by each project. We implement two strategies, a 7-3 queuing system and a 10-3 queuing system, and show that the 10-3 strategy provides an additional capacity improvement of 11.74% over the 7-3 strategy.
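The idea of application-aware differentiation can be illustrated with a minimal sketch, which is not the system described here. It assumes the application layer reports which project owns each transfer flow, since encrypted sessions hide that information from the network, and that a controller-side table then maps the project to a queue and keeps per-project byte counts. The names `POLICY`, `Flow`, and `Classifier`, the example addresses, and the queue numbers are all hypothetical and do not reproduce the 7-3 or 10-3 configurations.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Hypothetical policy: project name -> queue id. The queue numbers are
# illustrative only and do not reproduce the 7-3 or 10-3 setups.
POLICY = {"CMS": 1, "LIGO": 2}
DEFAULT_QUEUE = 0  # best-effort queue for flows with no known project


@dataclass(frozen=True)
class Flow:
    """Network-layer identity of a transfer (hypothetical 4-tuple)."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


@dataclass
class Classifier:
    """Joins application-layer project labels with network-layer flows."""
    flow_project: dict = field(default_factory=dict)
    bytes_by_project: dict = field(default_factory=lambda: defaultdict(int))

    def register(self, flow: Flow, project: str) -> None:
        # The application layer reports which project owns this flow.
        self.flow_project[flow] = project

    def queue_for(self, flow: Flow) -> int:
        # Look up the owning project, then apply the policy table.
        project = self.flow_project.get(flow)
        return POLICY.get(project, DEFAULT_QUEUE)

    def account(self, flow: Flow, nbytes: int) -> None:
        # Attribute transferred bytes to the owning project.
        project = self.flow_project.get(flow, "unknown")
        self.bytes_by_project[project] += nbytes


clf = Classifier()
cms_flow = Flow("10.0.0.5", 50001, "192.0.2.10", 2811)
other_flow = Flow("10.0.0.6", 50002, "192.0.2.10", 2811)
clf.register(cms_flow, "CMS")
clf.account(cms_flow, 4096)
clf.account(other_flow, 1024)
```

In this sketch a flow that was never registered falls back to the best-effort queue and is accounted under `"unknown"`, so per-project totals include only traffic the application layer has explicitly claimed.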
