Caching Transient Content for IoT Sensing: Multi-Agent Soft Actor-Critic

Edge nodes (ENs) in the Internet of Things (IoT) commonly serve as gateways that cache sensing data and provide access services for data consumers. This paper considers multiple ENs that cache sensing data under the coordination of the cloud. In particular, each EN can fetch content generated by sensors within its coverage, and this content can be uploaded to the cloud via fronthaul links and then delivered to ENs beyond the sensors' communication range. However, sensing data are usually transient, while frequent cache updates can incur considerable energy consumption at the sensors and heavy fronthaul traffic. Therefore, we adopt the age of information (AoI) to evaluate data freshness and investigate intelligent caching policies that preserve freshness while reducing cache update costs. Specifically, we model the cache update problem as a cooperative multi-agent Markov decision process with the goal of minimizing the long-term average weighted cost. To efficiently handle the exponentially large action space, we devise a novel reinforcement learning approach, a discrete multi-agent variant of soft actor-critic (SAC). Furthermore, we extend the proposed approach to decentralized control, where each EN makes decisions based only on local observations. Simulation results demonstrate the superior performance of the proposed SAC-based caching schemes.
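To make the discrete SAC building block concrete, the sketch below shows a generic soft policy-improvement step for a single EN agent with a categorical policy over cache-update actions. It is a minimal illustration of the standard discrete-action SAC objective, not the authors' exact multi-agent algorithm; the class name `DiscretePolicy`, the network sizes, and the observation/action dimensions are illustrative assumptions.

```python
# Minimal sketch (assumptions, not the paper's implementation) of a discrete
# soft actor-critic policy update for one edge node's cache-update decisions.
import torch
import torch.nn as nn

class DiscretePolicy(nn.Module):
    """Categorical policy pi(a | o) over an EN's discrete cache-update actions."""
    def __init__(self, obs_dim: int, num_actions: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, num_actions),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # Action probabilities for every discrete action.
        return torch.softmax(self.net(obs), dim=-1)

def discrete_sac_actor_loss(policy: DiscretePolicy,
                            q1: nn.Module, q2: nn.Module,
                            obs: torch.Tensor, alpha: float) -> torch.Tensor:
    """Soft policy-improvement loss for discrete actions:
       E_o[ sum_a pi(a|o) * (alpha * log pi(a|o) - min(Q1, Q2)(o, a)) ].
    The expectation over actions is computed exactly here, so no
    Gumbel-Softmax reparameterization is needed in this sketch."""
    probs = policy(obs)                          # (batch, num_actions)
    log_probs = torch.log(probs + 1e-8)          # numerical safety
    with torch.no_grad():
        q_min = torch.min(q1(obs), q2(obs))      # clipped double-Q estimate
    return (probs * (alpha * log_probs - q_min)).sum(dim=-1).mean()

if __name__ == "__main__":
    obs_dim, num_actions, batch = 16, 8, 32      # hypothetical sizes
    policy = DiscretePolicy(obs_dim, num_actions)
    # Twin critics mapping an observation to Q-values of all discrete actions.
    q1 = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(), nn.Linear(128, num_actions))
    q2 = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(), nn.Linear(128, num_actions))
    obs = torch.randn(batch, obs_dim)
    loss = discrete_sac_actor_loss(policy, q1, q2, obs, alpha=0.2)
    loss.backward()
    print(f"actor loss: {loss.item():.4f}")
```

In a decentralized deployment of the kind described above, each EN would hold its own copy of such a policy and condition only on its local observation (e.g., the AoI of its cached items), while any centralized critics used during training would see the joint state.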
