Neural Adaptive Video Streaming with Pensieve

Client-side video players employ adaptive bitrate (ABR) algorithms to optimize user quality of experience (QoE). Despite the abundance of recently proposed schemes, state-of-the-art ABR algorithms suffer from a key limitation: they use fixed control rules based on simplified or inaccurate models of the deployment environment. As a result, existing schemes inevitably fail to achieve optimal performance across a broad set of network conditions and QoE objectives. We propose Pensieve, a system that generates ABR algorithms using reinforcement learning (RL). Pensieve trains a neural network model that selects bitrates for future video chunks based on observations collected by client video players. Pensieve does not rely on pre-programmed models or assumptions about the environment. Instead, it learns to make ABR decisions solely through observations of the resulting performance of past decisions. As a result, Pensieve automatically learns ABR algorithms that adapt to a wide range of environments and QoE metrics. We compare Pensieve to state-of-the-art ABR algorithms using trace-driven and real-world experiments spanning a wide variety of network conditions, QoE metrics, and video properties. In all considered scenarios, Pensieve outperforms the best state-of-the-art scheme, with improvements in average QoE of 12% to 25%. Pensieve also generalizes well, outperforming existing schemes even on networks for which it was not explicitly trained.
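
The loop described above (observe, select a bitrate, see the resulting QoE, adjust the policy) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the abstract: the bitrate ladder, the two-element observation, the linear QoE form with its penalty values, and the use of a linear softmax policy with a plain REINFORCE-style update in place of a trained neural network.

```python
import math
import random

# Hypothetical bitrate ladder in kbps (illustrative; not from the source).
BITRATES_KBPS = [300, 750, 1200, 1850, 2850, 4300]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def policy_probs(observation, weights):
    """Linear softmax policy: a stand-in for a neural network policy."""
    scores = [sum(w * x for w, x in zip(row, observation)) for row in weights]
    return softmax(scores)


def select_bitrate(observation, weights, rng):
    """Sample a bitrate index from the policy's distribution."""
    probs = policy_probs(observation, weights)
    index = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return index, probs


def chunk_qoe(bitrate_kbps, prev_bitrate_kbps, rebuffer_s,
              rebuffer_penalty=4.3, smooth_penalty=1.0):
    """Assumed per-chunk QoE: quality minus rebuffering and switching penalties.

    The functional form and penalty values are assumptions for this sketch.
    """
    quality = bitrate_kbps / 1000.0
    switch = abs(bitrate_kbps - prev_bitrate_kbps) / 1000.0
    return quality - rebuffer_penalty * rebuffer_s - smooth_penalty * switch


def reinforce_update(weights, observation, action, reward, lr=0.1):
    """One REINFORCE step: raise log-probability of `action` in proportion to reward."""
    probs = policy_probs(observation, weights)
    for k, row in enumerate(weights):
        indicator = 1.0 if k == action else 0.0
        for j, x in enumerate(observation):
            # Gradient of log softmax for a linear policy: (1[k == a] - p_k) * x_j
            row[j] += lr * reward * (indicator - probs[k]) * x
    return weights


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical observation: [normalized throughput, normalized buffer level].
    obs = [1.0, 0.5]
    weights = [[0.0, 0.0] for _ in BITRATES_KBPS]
    action, _ = select_bitrate(obs, weights, rng)
    reward = chunk_qoe(BITRATES_KBPS[action], BITRATES_KBPS[0], rebuffer_s=0.0)
    reinforce_update(weights, obs, action, reward)
    print(action, round(reward, 3))
```

In this toy version the policy starts uniform, and a positive reward makes the sampled bitrate more likely for similar observations. A practical system would replace the linear policy with a neural network and train over many simulated or recorded network traces.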
