Optimizing Inter-Datacenter Tail Flow Completion Times using Best Worst-case Routing

Flow routing over inter-datacenter networks is a well-known problem where the network assigns a path to a newly arriving flow potentially according to the network conditions and the properties of the new flow. An essential system-wide performance metric for a routing algorithm is the flow completion times, which affect the performance of applications running across multiple datacenters. Current static and dynamic routing approaches do not take advantage of flow size information in routing, which is practical in a controlled environment such as inter-datacenter networks that are managed by the datacenter operators. In this paper, we discuss Best Worst-case Routing (BWR), which aims at optimizing the tail completion times of long-running flows over inter-datacenter networks with non-uniform link capacities. Since finding the path with the best worst-case completion time for a new flow is NP-Hard, we investigate two heuristics, BWRH and BWRHF, which use two different upper bounds on the worst-case completion times for routing. We evaluate BWRH and BWRHF against several real WAN topologies and multiple traffic patterns. Although BWRH better models the BWR problem, BWRH and BWRHF show negligible difference across various system-wide performance metrics, while BWRHF being significantly faster. Furthermore, we show that compared to other popular routing heuristics, BWRHF can reduce the mean and tail flow completion times by over 1.5× and 2×, respectively.