Internet Congestion Control via Deep Reinforcement Learning

We present and investigate a novel and timely application domain for deep reinforcement learning (RL): Internet congestion control. Congestion control is the core networking task of modulating traffic sources' data-transmission rates so as to use network capacity efficiently. It has drawn extensive attention with the advent of Internet services such as live video, virtual reality, and the Internet of Things. We show that casting congestion control as an RL problem enables training deep network policies that capture intricate patterns in data traffic and network conditions, and that these policies outperform the state of the art. We also highlight significant challenges facing real-world adoption of RL-based congestion control, including fairness, safety, and generalization, which are not trivial to address within the conventional RL formalism. To facilitate further research and the reproducibility of our results, we present a test suite for RL-guided congestion control based on the OpenAI Gym interface.
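To make the RL framing concrete, the following is a minimal sketch of what a Gym-style congestion control environment might look like. It is not the test suite described above: the class name `ToyCongestionEnv`, the single-bottleneck link model, the three-element observation, the action scaling, and the reward weights are all assumptions chosen for illustration. The sketch follows the classic Gym convention of `reset()` returning an observation and `step(action)` returning `(observation, reward, done, info)`, but it does not import Gym so that it stays self-contained.

```python
import random


class ToyCongestionEnv:
    """Hypothetical single-sender, single-bottleneck environment.

    All rates are in packets per monitor interval; one step() call
    simulates one interval. Every constant here is an illustrative guess.
    """

    def __init__(self, capacity=100.0, base_rtt=0.05, queue_limit=50.0,
                 horizon=200, seed=0):
        self.capacity = capacity        # packets the link serves per interval
        self.base_rtt = base_rtt        # propagation delay in seconds
        self.queue_limit = queue_limit  # bottleneck buffer size in packets
        self.horizon = horizon          # episode length in steps
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        # Start from a random sending rate so the agent must adapt.
        self.rate = self.rng.uniform(0.3, 1.5) * self.capacity
        self.queue = 0.0
        self.t = 0
        # Observation: (rate / capacity, latency inflation, loss rate).
        return (self.rate / self.capacity, 1.0, 0.0)

    def step(self, action):
        # Action in [-1, 1] changes the sending rate by at most 10 percent.
        action = max(-1.0, min(1.0, float(action)))
        self.rate = max(1.0, self.rate * (1.0 + 0.1 * action))

        arrived = self.rate
        served = min(self.queue + arrived, self.capacity)
        backlog = self.queue + arrived - served
        dropped = max(0.0, backlog - self.queue_limit)
        self.queue = backlog - dropped
        loss = dropped / arrived

        # Latency inflation: queueing delay relative to the base RTT.
        latency_ratio = (self.base_rtt + self.queue / self.capacity) / self.base_rtt

        # Reward trades throughput against latency and loss (assumed weights).
        reward = served / self.capacity - 0.1 * (latency_ratio - 1.0) - 5.0 * loss

        self.t += 1
        done = self.t >= self.horizon
        obs = (self.rate / self.capacity, latency_ratio, loss)
        return obs, reward, done, {"throughput": served}


def backoff_policy(obs):
    """Hand-written baseline: back off on loss or queueing, else probe upward."""
    _, latency_ratio, loss = obs
    return -1.0 if (loss > 0.0 or latency_ratio > 1.5) else 0.5


def run_episode(env, policy):
    """Roll out one episode and return the total reward."""
    obs, total, done = env.reset(), 0.0, False
    while not done:
        obs, reward, done, _ = env.step(policy(obs))
        total += reward
    return total
```

Calling `run_episode(ToyCongestionEnv(), backoff_policy)` rolls out the hand-written baseline; a learned policy would take the place of `backoff_policy`, mapping the same observation tuple to a rate adjustment. Because the reward is a fixed weighted sum for a single sender, a sketch like this cannot by itself express the fairness and safety concerns raised above.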
