Paper Information - Reinforcement Learning Meets Wireless Networks: A Layering Perspective

Driven by soaring traffic demand and the growing diversity of mobile services, wireless networks are becoming increasingly dense and heterogeneous. In such large-scale and complicated networks, optimal control is reaching unprecedented levels of complexity, while traditional solutions based on handcrafted offline algorithms become inefficient because of their high complexity, low robustness, and high overhead. Reinforcement learning (RL), which enables network entities to learn from their actions and the resulting consequences in an interactive network environment, has therefore attracted significant attention. In this article, we comprehensively review the applications of RL in wireless networks from a layering perspective. First, we present an overview of the principles, fundamentals, and several advanced models of RL. Then, we review up-to-date applications of RL in various functional blocks of different network layers, ranging from the low-level physical layer to the high-level application layer. Finally, we outline a broad spectrum of challenges, open issues, and future research directions for RL-empowered wireless networks.
