Network planning with deep reinforcement learning

Network planning is critical to the performance, reliability and cost of web services. The problem is typically formulated as an Integer Linear Programming (ILP) problem. Because ILP solvers do not scale to large instances, today's practice relies on heuristics hand-tuned by human experts. In this paper, we propose NeuroPlan, a deep reinforcement learning (RL) approach to the network planning problem. The problem involves multi-step decision making and cost minimization, so it can be naturally cast as a deep RL problem. We develop two important domain-specific techniques. First, we use a graph neural network (GNN) and a novel domain-specific node-link transformation for state encoding, in order to handle a network topology that evolves as planning decisions are made. Second, we leverage a two-stage hybrid approach that first uses deep RL to prune the search space and then uses an ILP solver to find the optimal solution within the pruned space. This approach resembles today's practice, but replaces the human experts with an RL agent in the first stage. Evaluation on real topologies and setups from large production networks demonstrates that NeuroPlan scales to large topologies beyond the capability of ILP solvers, and reduces the cost by up to 17% compared to hand-tuned heuristics.
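The two-stage idea can be illustrated with a minimal sketch. This is not NeuroPlan's implementation: the three-link topology, the unit costs, the demand, the `relax` parameter and the hard-coded first-stage plan are all hypothetical, and an exhaustive integer search stands in for the ILP solver. The first-stage plan plays the role of the RL agent's output; its per-link capacities become upper bounds that shrink the space searched in the second stage.

```python
from itertools import product

# Hypothetical toy instance: cost per capacity unit on each link.
UNIT_COST = {"AB": 1, "BC": 1, "AC": 3}
DEMAND = 4  # units of traffic that must be carried from A to C


def feasible(plan):
    """Traffic from A to C can use the direct link or the two-hop path via B."""
    return min(plan["AB"], plan["BC"]) + plan["AC"] >= DEMAND


def cost(plan):
    """Total cost of a plan: capacity units times unit cost, summed over links."""
    return sum(UNIT_COST[link] * units for link, units in plan.items())


def exact_search(upper_bounds):
    """Stand-in for an ILP solver: enumerate every integer plan within the bounds."""
    links = sorted(upper_bounds)
    best = None
    for units in product(*(range(upper_bounds[link] + 1) for link in links)):
        plan = dict(zip(links, units))
        if feasible(plan) and (best is None or cost(plan) < cost(best)):
            best = plan
    return best


def two_stage(first_stage_plan, relax=1.0):
    """Use the first-stage plan to bound each link, then search exactly inside."""
    bounds = {link: int(relax * units) for link, units in first_stage_plan.items()}
    return exact_search(bounds)


# Assumed output of the first stage: feasible but not cost-optimal.
first_stage_plan = {"AB": 3, "BC": 4, "AC": 2}
final_plan = two_stage(first_stage_plan)
print(cost(first_stage_plan), final_plan, cost(final_plan))
```

In this toy instance the second stage only examines plans that stay within the first-stage capacities, yet it still finds a cheaper feasible plan. A larger `relax` value widens the bounds, trading a bigger search space for a chance at a better optimum.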
