QFlow: A Reinforcement Learning Approach to High QoE Video Streaming over Wireless Networks

Wireless Internet access has brought legions of heterogeneous applications that all share the same resources. However, current wireless edge networks, which cater to worst-case or average-case performance, lack the agility to best serve these diverse sessions. At the same time, software-reconfigurable infrastructure has become mainstream to the point that dynamic per-packet and per-flow decisions are possible at multiple layers of the communications stack. Exploiting such reconfigurability requires a system that can enable a configuration, measure its impact on application performance (Quality of Experience, QoE), and adaptively select a new configuration. Effectively, this feedback loop is a Markov Decision Process whose parameters are unknown. The goal of this work is to design, develop, and demonstrate QFlow, a system that instantiates this feedback loop as an application of reinforcement learning (RL). Our context is reconfigurable (priority) queueing, and we use the popular application of video streaming as our use case. We develop both model-free and model-based RL approaches tailored to the problem of determining which clients should be assigned to which queue at each decision period. Through experimental validation, we show that the RL-based control policies on QFlow schedule the right clients for prioritization in a high-load scenario. They outperform both the status quo and the best known solutions, with over 25% improvement in QoE and a perfect QoE score of 5 more than 85% of the time.
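
To make the feedback loop concrete, the following is a minimal sketch of a model-free approach (plain tabular Q-learning) on a toy queue-assignment problem. It is not the paper's algorithm or system model: the number of clients, the buffer discretization, the drain probability, the QoE proxy, and all function names (`step`, `train`, `greedy_action`) are assumptions chosen purely for illustration. The state is each client's buffer level, the action picks which client goes into the high-priority queue for the next decision period, and the reward is a stand-in QoE score on a 1 to 5 scale.

```python
import random
from collections import defaultdict

N_CLIENTS = 3   # hypothetical number of video clients
MAX_BUF = 4     # toy discretization: buffer levels 0..4

def step(state, action, rng):
    """Toy dynamics (an assumption, not the paper's model): the prioritized
    client gains one buffer level; every other client loses one level
    with probability 0.5."""
    nxt = []
    for i, buf in enumerate(state):
        if i == action:
            buf = min(MAX_BUF, buf + 1)
        elif rng.random() < 0.5:
            buf = max(0, buf - 1)
        nxt.append(buf)
    # QoE proxy: a client with an empty buffer (stalled) scores 1, otherwise 5.
    reward = sum(1 if buf == 0 else 5 for buf in nxt) / N_CLIENTS
    return tuple(nxt), reward

def train(episodes=300, horizon=50, alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = defaultdict(lambda: [0.0] * N_CLIENTS)
    for _ in range(episodes):
        state = tuple(rng.randint(0, MAX_BUF) for _ in range(N_CLIENTS))
        for _ in range(horizon):
            if rng.random() < eps:
                action = rng.randrange(N_CLIENTS)      # explore
            else:
                action = greedy_action(q, state)       # exploit
            nxt, reward = step(state, action, rng)
            # One-step Q-learning update toward the bootstrapped target.
            target = reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
    return q

def greedy_action(q, state):
    """Index of the client to prioritize under the current Q-table."""
    return max(range(N_CLIENTS), key=lambda a: q[state][a])
```

After calling `q = train()`, `greedy_action(q, state)` returns the client that the learned table would place in the high-priority queue for a given tuple of buffer levels. A model-based variant would instead estimate the transition probabilities from observed samples and plan against that estimate.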
