Applications of Deep Reinforcement Learning in Communications and Networking: A Survey

This paper presents a comprehensive literature review on applications of deep reinforcement learning (DRL) in communications and networking. Modern networks, e.g., Internet of Things (IoT) and unmanned aerial vehicle (UAV) networks, are becoming more decentralized and autonomous. In such networks, network entities need to make decisions locally to maximize network performance under uncertainty in the network environment. Reinforcement learning has been used effectively to let network entities obtain an optimal policy, e.g., which decisions or actions to take given their states, when the state and action spaces are small. However, in complex and large-scale networks, the state and action spaces are usually large, and reinforcement learning may not be able to find the optimal policy in a reasonable time. Therefore, DRL, a combination of reinforcement learning and deep learning, has been developed to overcome these shortcomings. In this survey, we first give a tutorial on DRL, from fundamental concepts to advanced models. Then, we review DRL approaches proposed to address emerging issues in communications and networking. The issues include dynamic network access, data rate control, wireless caching, data offloading, network security, and connectivity preservation, all of which are important to next-generation networks such as 5G and beyond. Furthermore, we present applications of DRL for traffic routing, resource sharing, and data collection. Finally, we highlight important challenges, open issues, and future research directions for applying DRL.
