Decentralized Delay Optimal Control for Interference Networks With Limited Renewable Energy Storage

In this paper, we consider delay minimization for interference networks with renewable energy sources, where the transmission power of a node comes from both conventional utility power (AC power) and a renewable energy source. We assume that the transmission power of each node is a function of the local channel state, local data queue state, and local energy queue state only. We consider two delay optimization formulations, namely the decentralized partially observable Markov decision process (DEC-POMDP) and the noncooperative partially observable stochastic game (POSG). In the DEC-POMDP formulation, we derive a decentralized online learning algorithm, based on the policy gradient approach, that determines the control actions and the Lagrangian multipliers (LMs) simultaneously. Under some mild technical conditions, the proposed decentralized policy gradient algorithm converges almost surely to a locally optimal solution. In the noncooperative POSG formulation, the transmitter nodes do not cooperate. For this case, we extend the decentralized policy gradient solution and prove almost-sure convergence of the learning algorithm. In both cases, the solutions are robust to model variations.
