Configtron: Tackling network diversity with heterogeneous configurations

The web serving protocol stack is constantly evolving to keep pace with technological shifts in networking infrastructure and growing website complexity. As a result, the stack now includes a plethora of protocols and configuration parameters that allow it to address a wide variety of network conditions. Yet today, most content providers adopt a "one-size-fits-all" approach to configuring the networking stack of their user-facing web servers (or at best employ moderate tuning), despite the significant diversity in end-user networks and devices. In this paper, we revisit this problem and ask a more fundamental question: are there benefits to tuning the network stack? If so, what system design choices and algorithmic ensembles are required to enable modern content providers to tune their protocol stacks dynamically and flexibly? We demonstrate through substantial empirical evidence that the "one-size-fits-all" approach results in sub-optimal performance, and we argue for a novel framework that extends existing CDN architectures to provide programmatic control over the configuration options of the CDN serving stack. We designed ConfigTron, a data-driven framework that leverages data from all connections to identify their network characteristics and learn the optimal configuration parameters to improve end-user performance. ConfigTron uses a contextual multi-armed bandit-based learning algorithm to find optimal configurations in minimal time, enabling content providers to systematically explore heterogeneous configurations while improving end-user page load time by as much as 19% (up to 750 ms) at the median.
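To make the idea of contextual bandit-based configuration selection concrete, the following is a minimal sketch and not ConfigTron's actual algorithm. The abstract does not say which bandit variant is used, so a simple per-context epsilon-greedy strategy is swapped in here. The network classes (`fiber`, `cellular_3g`), the candidate configurations (congestion control algorithm and initial congestion window), and the page load time table are all hypothetical values chosen only for illustration.

```python
import random
from collections import defaultdict

# Hypothetical network classes (contexts) and candidate stack configurations,
# written as (congestion control algorithm, initial congestion window).
CONTEXTS = ["fiber", "cellular_3g"]
CONFIGS = [("cubic", 10), ("cubic", 30), ("bbr", 10)]

# Made-up mean page load times in ms, used only to drive the simulation.
TRUE_MEAN_PLT_MS = {
    ("fiber", ("cubic", 10)): 1200, ("fiber", ("cubic", 30)): 900,
    ("fiber", ("bbr", 10)): 1150,
    ("cellular_3g", ("cubic", 10)): 3000, ("cellular_3g", ("cubic", 30)): 3400,
    ("cellular_3g", ("bbr", 10)): 2600,
}


class EpsilonGreedyContextualBandit:
    """Keeps one epsilon-greedy bandit per context; lower PLT is better."""

    def __init__(self, configs, epsilon=0.1, seed=0):
        self.configs = list(configs)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = defaultdict(int)      # (context, config) -> samples seen
        self.mean_plt = defaultdict(float)  # (context, config) -> running mean

    def best(self, context):
        """Return the tried configuration with the lowest mean PLT."""
        tried = [c for c in self.configs if self.counts[(context, c)] > 0]
        if not tried:
            return self.configs[0]
        return min(tried, key=lambda c: self.mean_plt[(context, c)])

    def choose(self, context):
        """Try each configuration once, then explore with probability epsilon."""
        untried = [c for c in self.configs if self.counts[(context, c)] == 0]
        if untried:
            return untried[0]
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.configs)
        return self.best(context)

    def update(self, context, config, plt_ms):
        """Fold one observed page load time into the running mean."""
        key = (context, config)
        self.counts[key] += 1
        self.mean_plt[key] += (plt_ms - self.mean_plt[key]) / self.counts[key]


def simulate(rounds=2000, seed=0):
    """Serve simulated connections and learn a configuration per context."""
    rng = random.Random(seed)
    bandit = EpsilonGreedyContextualBandit(CONFIGS, epsilon=0.1, seed=seed)
    for _ in range(rounds):
        context = rng.choice(CONTEXTS)
        config = bandit.choose(context)
        # Noisy observation around the hypothetical true mean.
        plt_ms = rng.gauss(TRUE_MEAN_PLT_MS[(context, config)], 50.0)
        bandit.update(context, config, plt_ms)
    return bandit


if __name__ == "__main__":
    learned = simulate()
    for ctx in CONTEXTS:
        print(ctx, "->", learned.best(ctx))
```

In this toy setting the learner settles on a different configuration for each network class, which is the behavior a single "one-size-fits-all" configuration cannot provide. A production system would also need to derive contexts from measured connection characteristics and bound the cost of exploration, neither of which this sketch attempts.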
