CLEO: Machine Learning for ECMP

In this paper, we propose CLEO, a machine learning approach to distributing and balancing traffic under equal-cost multipath routing (ECMP). ECMP-based load balancing is widely used in datacenters, but hash collisions resulting from skewed ECMP hashing make it difficult to achieve the desired throughput on each path. Various solutions have been proposed to overcome the performance degradation caused by hash collisions, but most of them require modifying packet headers or replacing switches. To address this limitation, CLEO builds a neural-network model that characterizes the ECMP scheme of a switch. A proof-of-concept evaluation shows that CLEO achieves a fourfold reduction in the root mean square error between the desired and actual path throughputs.
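To make the problem concrete, the following minimal sketch shows how hash-based path selection can leave paths unevenly loaded, and how the root mean square error between desired and actual path throughputs can be computed. It is an illustration only, not CLEO's model or any switch's actual ECMP scheme: the use of CRC32 as the hash, the 5-tuple layout, the flow rates, and all function names are assumptions made for this example.

```python
import math
import zlib


def ecmp_path(five_tuple, n_paths):
    """Map a flow to a path index by hashing its 5-tuple.

    CRC32 is only a stand-in here; real switches use vendor-specific hashes.
    """
    key = "|".join(str(field) for field in five_tuple).encode()
    return zlib.crc32(key) % n_paths


def path_throughputs(flows, n_paths):
    """Sum the rate of every flow onto the path its hash selects."""
    loads = [0.0] * n_paths
    for five_tuple, rate in flows:
        loads[ecmp_path(five_tuple, n_paths)] += rate
    return loads


def rmse(desired, actual):
    """Root mean square error between two equal-length throughput vectors."""
    squared = [(d - a) ** 2 for d, a in zip(desired, actual)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical workload: eight 1 Gbps flows that differ only in source port.
flows = [(("10.0.0.1", "10.0.1.1", 6, 40000 + i, 80), 1.0) for i in range(8)]
n_paths = 4

actual = path_throughputs(flows, n_paths)
# Desired outcome in this example: an even split of the total load.
desired = [sum(rate for _, rate in flows) / n_paths] * n_paths

print("actual per-path throughput:", actual)
print("RMSE vs. even split:", rmse(desired, actual))
```

Whenever several flows hash to the same path, that path's load exceeds its share and the RMSE becomes nonzero. A model that predicts the output of `ecmp_path` for a given switch could, in principle, be used to choose flow identifiers that bring the actual loads closer to the desired ones.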