Optimizing age of information on real-life TCP/IP connections through reinforcement learning

Age of Information (AoI) has emerged as a performance metric that captures the freshness of data in status-update-based applications (e.g., remote monitoring), offering a more suitable alternative to classical network performance indicators such as throughput or delay. Optimizing AoI often requires distinctly novel, and sometimes counter-intuitive, networking policies that adapt the rate of update transmissions to the randomness in network resources. However, almost all previous work on AoI to date has been theoretical, assuming idealized network models with known delay and service-time distributions. These statistics are difficult to obtain and optimize for in a real-life network, where many phenomena interact across different networking layers (consider, e.g., an end-to-end IoT application running over the Internet). In this work, we introduce a deep reinforcement learning-based approach that learns to minimize AoI with no prior assumptions about network topology. Evaluating the learned model on an emulated network, we show that the method can be scaled up to realistic networks with unknown delay distributions.
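
To make the idea concrete, the following is a minimal, self-contained sketch of learning an update policy that minimizes AoI. The toy single-link environment, its parameters (SUCCESS_PROB, MAX_AGE), and the tabular Q-learning agent are illustrative assumptions of this summary, standing in for the paper's deep Q-network and emulated TCP/IP testbed; they are not the authors' implementation.

```python
import random
import numpy as np

# Hypothetical toy setup (not the paper's emulated network): a source decides
# each time slot whether to send a fresh status update over a link with a
# random (geometric) delivery time. The reward is the negative age of
# information (AoI) at the monitor, so the agent learns a sending policy that
# keeps data fresh rather than maximizing throughput.

MAX_AGE = 50          # AoI is capped to keep the Q-table finite
SUCCESS_PROB = 0.4    # per-slot delivery probability of an in-flight update


class ToyUpdateLink:
    """Single-hop status-update link; state = (AoI at monitor, channel busy?)."""

    def reset(self):
        self.age = 1            # AoI at the monitor
        self.in_flight = None   # age of the update currently in transit, if any
        return (self.age, 0)

    def step(self, send):
        # Start a fresh transmission if requested and the channel is idle.
        if send and self.in_flight is None:
            self.in_flight = 0
        delivered_age = None
        if self.in_flight is not None:
            self.in_flight += 1                   # update ages while in transit
            if random.random() < SUCCESS_PROB:    # delivery event
                delivered_age, self.in_flight = self.in_flight, None
        # AoI resets to the delivered update's age, otherwise grows by one slot.
        self.age = delivered_age if delivered_age is not None else min(self.age + 1, MAX_AGE)
        busy = 0 if self.in_flight is None else 1
        return (self.age, busy), -self.age        # reward = negative AoI


def train(episodes=300, steps=200, alpha=0.1, gamma=0.95, eps=0.1):
    """Tabular Q-learning; a stand-in for the deep Q-network used in the paper."""
    env, q = ToyUpdateLink(), np.zeros((MAX_AGE + 1, 2, 2))
    for _ in range(episodes):
        age, busy = env.reset()
        for _ in range(steps):
            a = random.randrange(2) if random.random() < eps else int(np.argmax(q[age, busy]))
            (age2, busy2), r = env.step(bool(a))
            q[age, busy, a] += alpha * (r + gamma * np.max(q[age2, busy2]) - q[age, busy, a])
            age, busy = age2, busy2
    return q


if __name__ == "__main__":
    q = train()
    # Greedy action (0 = wait, 1 = send) when the channel is idle, for small AoI values.
    print([int(np.argmax(q[a, 0])) for a in range(1, 11)])
```

A learned policy in this toy model typically sends whenever the channel is idle and the AoI exceeds a small threshold, illustrating the "lazy is timely" behavior that AoI-optimal policies exhibit; the paper replaces the table with a neural network so the same idea applies to the much larger state spaces of real TCP/IP connections.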
