Calibers: A bandwidth calendaring paradigm for science workflows

Abstract Many scientific workflows require large data transfers between distributed instrument facilities, storage, and computing resources. To keep these resources fully utilized, the R&E (research and education) networks connecting them must make an inherently unpredictable network behave predictably. In practice, this amounts to over-provisioning network resources per application in an attempt to guarantee adequate throughput to users, which often leaves resources under-utilized over time. One promising solution is the use of deadlines and bandwidth calendaring, in which “fair” resource allocation is replaced with deadline-based resource allocation. However, existing approaches often struggle to regulate resource allocation efficiently and to handle failure modes. Our solution, Calibers, approaches bandwidth calendaring and deadline-awareness differently. Calibers uses shaping, metering, and pacing at the edge of the network and at the end system, so that participating clients can schedule bandwidth reservations without having to worry about network noise from non-participating clients. Calibers can also fall back to the fair resource allocation of the underlying transport protocol if necessary. For example, if a non-participating flow enters the core of the network, or a sudden network change causes the available bandwidth to be exceeded, the transport protocol's congestion avoidance handles the congestion as it normally would. Furthermore, Calibers provides a novel simulation method and resource allocation algorithm. In this paper, we present the prototype architecture for Calibers, which uses a central controller with distributed agents to dynamically pace flows at the ingress of the network to meet deadlines. Using Globus/GridFTP, we experimentally demonstrate that pacing can be used to meet data transfer deadlines that cannot be achieved using TCP. Finally, we present dynamic flow pacing algorithms that maximize the acceptance ratio of flows whose deadlines can be met while maximizing network utilization. Our results show that simple heuristics, optimizing locally on the most bottlenecked link, can perform almost as well as heuristics that attempt to optimize globally.
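The abstract does not spell out the pacing algorithms, so the following is only an illustrative sketch of the general idea of deadline-aware admission and pacing with a local, bottleneck-link heuristic. It is not the Calibers algorithm. The names `Flow`, `min_rate`, `admit`, and `pace`, the single-path flow model, and the equal sharing of spare capacity are all assumptions made for the example. A flow is admitted only if every link on its path can still carry the minimum rate each flow needs to finish by its deadline; spare capacity is then split according to each flow's most constrained link.

```python
from dataclasses import dataclass


@dataclass
class Flow:
    # Hypothetical flow record; field names are illustrative assumptions.
    name: str
    remaining_bits: float   # data still to be transferred
    deadline_s: float       # absolute deadline, in seconds
    path: tuple             # identifiers of the links the flow crosses


def min_rate(flow, now_s=0.0):
    """Lowest constant rate (bits/s) that still meets the flow's deadline."""
    time_left = flow.deadline_s - now_s
    if time_left <= 0:
        return float("inf")  # deadline already passed: cannot be met
    return flow.remaining_bits / time_left


def admit(new_flow, active, capacity_bps, now_s=0.0):
    """Accept new_flow only if every link on its path can carry the
    minimum rates of all flows that cross that link."""
    for link in new_flow.path:
        load = sum(min_rate(f, now_s) for f in active if link in f.path)
        if load + min_rate(new_flow, now_s) > capacity_bps[link]:
            return False
    return True


def pace(active, capacity_bps, now_s=0.0):
    """Give each flow its minimum rate plus an equal share of the spare
    capacity on its most bottlenecked link (smallest per-flow share).
    Assumes the active set was admitted, so spare capacity is >= 0."""
    share = {}
    for link, cap in capacity_bps.items():
        on_link = [f for f in active if link in f.path]
        if on_link:
            spare = cap - sum(min_rate(f, now_s) for f in on_link)
            share[link] = spare / len(on_link)
    return {
        f.name: min_rate(f, now_s) + min(share[link] for link in f.path)
        for f in active
    }


if __name__ == "__main__":
    links = {"a": 10.0, "b": 5.0}  # toy capacities in bits/s
    flows = [
        Flow("f1", remaining_bits=40.0, deadline_s=10.0, path=("a",)),
        Flow("f2", remaining_bits=20.0, deadline_s=10.0, path=("a", "b")),
    ]
    print(pace(flows, links))
```

Because each flow takes the smallest per-flow share along its path, the extra rate handed out on any link never exceeds that link's spare capacity, so the resulting pacing rates stay feasible while using only per-link information.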

[1]  William E. Allcock,et al.  The Globus Striped GridFTP Framework and Server , 2005, ACM/IEEE SC 2005 Conference (SC'05).

[2]  Steven H. Low,et al.  TCP Pacing Revisited , 2022 .

[3]  Doug Leith,et al.  H-TCP: TCP Congestion Control for High Bandwidth-Delay Product Paths , 2008 .

[4]  Dorian Kcira,et al.  High speed scientific data transfers using software defined networking , 2015, INDIS '15.

[5]  Srikanth Kandula,et al.  Calendaring for wide area networks , 2015, SIGCOMM 2015.

[6]  Min Zhu,et al.  B4: experience with a globally-deployed software defined wan , 2013, SIGCOMM.

[7]  Amit Aggarwal,et al.  Understanding the performance of TCP pacing , 2000, Proceedings IEEE INFOCOM 2000. Conference on Computer Communications. Nineteenth Annual Joint Conference of the IEEE Computer and Communications Societies (Cat. No.00CH37064).

[8]  Van Jacobson,et al.  BBR: Congestion-Based Congestion Control , 2016, ACM Queue.

[9]  Srikanth Kandula,et al.  Achieving high utilization with software-driven WAN , 2013, SIGCOMM.

[10]  Vijay Sivaraman,et al.  Comparing edge and host traffic pacing in small buffer networks , 2015, Comput. Networks.

[11]  B. Annappa,et al.  On the Effectiveness of CoDel for Active Queue Management , 2013, 2013 Third International Conference on Advanced Computing and Communication Technologies (ACCT).

[12]  Nicola Blefari-Melazzi,et al.  Information centric networking over SDN and OpenFlow: Architectural aspects and experiments on the OFELIA testbed , 2013, Comput. Networks.

[13]  Dipak Ghosal,et al.  A model predictive control approach to flow pacing for TCP , 2017, 2017 55th Annual Allerton Conference on Communication, Control, and Computing (Allerton).

[14]  Takashi Watanabe,et al.  Joint Bandwidth Scheduling and Routing Method for Large File Transfer with Time Constraint and Its Implementation , 2018, IEICE Trans. Commun..

[15]  Takashi Watanabe,et al.  Joint bandwidth scheduling and routing method for large file transfer with time constraint , 2016, NOMS 2016 - 2016 IEEE/IFIP Network Operations and Management Symposium.