MBECN: Enabling ECN with Micro-burst Traffic in Multi-queue Data Center

Modern multi-queue data centers often use the standard Explicit Congestion Notification (ECN) scheme to achieve high network performance. A substantial drawback of this approach is that micro-burst traffic can push the instantaneous queue length above the ECN marking threshold, so that many packets are marked even though there is no persistent congestion (mismarking). After receiving too many such marks, senders may overreact and reduce their sending rates more than necessary, leading to severe throughput loss. To address this problem, we propose the Micro-burst ECN (MBECN) scheme to mitigate mismarking. Based on steady-state analysis and an ideal generalized processor sharing (GPS) model, MBECN finds a more appropriate threshold baseline for each queue so that micro-bursts can be absorbed. By adjusting thresholds dynamically according to queue occupancy, MBECN handles packet backlog effectively without hurting latency. In testbed experiments, MBECN improves throughput by ~20% and reduces flow completion time (FCT) by ~40%. In large-scale simulations, throughput improves by 1.5~2.4× with DCTCP and by 1.26~1.35× with ECN*. We also measure network delay and find that latency increases by only 7.36%.
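The mismarking effect described above can be illustrated with a toy discrete-time queue. This is only a minimal sketch of generic instantaneous-queue-length ECN marking, not the MBECN algorithm itself; the function name `count_ecn_marks`, the tick-based model, and all numeric values (arrival pattern, drain rate, thresholds) are assumptions chosen for illustration.

```python
def count_ecn_marks(arrivals, drain_per_tick, threshold):
    """Count packets that would be ECN-marked in a toy single-queue model.

    A packet is marked when the instantaneous queue length, including that
    packet, exceeds `threshold`. All parameters are hypothetical.
    """
    queue_len = 0
    marked = 0
    for arriving in arrivals:
        # Enqueue this tick's packets one by one and check the threshold.
        for _ in range(arriving):
            queue_len += 1
            if queue_len > threshold:
                marked += 1
        # The link then drains up to `drain_per_tick` packets.
        queue_len = max(0, queue_len - drain_per_tick)
    return marked


# Steady load (3 packets/tick) stays below the drain rate (4 packets/tick),
# so the link is not persistently congested; a single 20-packet micro-burst
# arrives at tick 5.
arrivals = [3] * 5 + [20] + [3] * 10

low = count_ecn_marks(arrivals, drain_per_tick=4, threshold=8)
high = count_ecn_marks(arrivals, drain_per_tick=4, threshold=24)
print("marks with low threshold: ", low)
print("marks with high threshold:", high)
```

In this toy model the low threshold marks packets during the burst and for several ticks afterwards while the backlog drains, whereas the higher threshold absorbs the burst without marking. A statically high threshold would in general also allow longer queues, which is why a scheme that adjusts the threshold with queue occupancy, as the abstract describes, is of interest.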
