DCCast: Efficient Point to Multipoint Transfers Across Datacenters

Using multiple datacenters allows for higher availability, load balancing, and reduced latency to customers of cloud services. To distribute multiple copies of data, cloud providers depend on inter-datacenter WANs, which must be used efficiently given their limited capacity and ever-increasing data demands. In this paper, we focus on applications that transfer objects from one datacenter to several datacenters over dedicated inter-datacenter networks. We present DCCast, a centralized Point to Multi-Point (P2MP) algorithm that uses forwarding trees to efficiently deliver an object from a source datacenter to its required destination datacenters. With low computational overhead, DCCast selects forwarding trees that minimize bandwidth usage and balance load across all links. Using simulation experiments on Google's GScale network, we show that DCCast can reduce total bandwidth usage and tail Transfer Completion Times (TCT) by up to $50\%$ compared to delivering the same objects via independent point-to-point (P2P) transfers.
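
To make the bandwidth argument concrete, the following is a minimal Python sketch of how a forwarding tree can use fewer link transmissions than independent P2P transfers. It is not DCCast's actual tree-selection procedure, which the abstract does not specify. The sketch assumes undirected links, an edge weight equal to the link's current load plus the transfer volume, and a greedy nearest-destination-first Steiner tree heuristic. The function names, the toy topology, and the numbers are all hypothetical.

```python
import heapq
from collections import defaultdict


def build_adj(edges):
    """Adjacency lists for an undirected topology (assumption: links are symmetric)."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj


def shortest_path_from_set(adj, weight, sources, targets):
    """Dijkstra started from every node in `sources`; returns the path to the closest target."""
    dist = {s: 0.0 for s in sources}
    prev = {}
    heap = [(0.0, s) for s in sorted(sources)]
    heapq.heapify(heap)
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u in targets:
            # Walk predecessors back until we reach a node that was a source.
            path = [u]
            while path[-1] in prev:
                path.append(prev[path[-1]])
            return path[::-1]
        for v in adj[u]:
            nd = d + weight(u, v)
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    raise ValueError("some destination is unreachable")


def forwarding_tree(edges, load, source, dests, volume):
    """Greedy Steiner heuristic: repeatedly attach the closest remaining destination."""
    adj = build_adj(edges)
    # Assumed weight: load already scheduled on the link plus this transfer's volume.
    weight = lambda u, v: load.get(frozenset((u, v)), 0.0) + volume
    tree_nodes, tree_edges = {source}, set()
    remaining = set(dests) - tree_nodes
    while remaining:
        path = shortest_path_from_set(adj, weight, tree_nodes, remaining)
        tree_edges.update(frozenset(e) for e in zip(path, path[1:]))
        tree_nodes.update(path)
        remaining -= tree_nodes
    return tree_edges


def p2p_link_uses(edges, load, source, dests, volume):
    """Number of link traversals when each destination gets its own unicast path."""
    adj = build_adj(edges)
    weight = lambda u, v: load.get(frozenset((u, v)), 0.0) + volume
    return sum(
        len(shortest_path_from_set(adj, weight, {source}, {d})) - 1 for d in dests
    )


# Toy topology (hypothetical): S reaches D1 and D2 only through A.
edges = [("S", "A"), ("A", "D1"), ("A", "D2")]
volume = 10  # units of data to deliver to every destination
tree = forwarding_tree(edges, {}, "S", ["D1", "D2"], volume)
tree_bw = len(tree) * volume
p2p_bw = p2p_link_uses(edges, {}, "S", ["D1", "D2"], volume) * volume
print(tree_bw, p2p_bw)  # 30 40
```

In this toy case the link S-A carries the object once under the tree but twice under P2P, so the tree uses 30 units of link bandwidth instead of 40. Because the assumed weight includes existing load, a heavily loaded link becomes more expensive and the heuristic steers new trees away from it, which is one simple way to approximate load balancing.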
