Distributed correlated Q-learning for dynamic transmission control of sensor networks

This paper considers a Markovian dynamic game-theoretic setting for distributed transmission control in a wireless sensor network. The available spectrum bandwidth is modeled as a Markov chain. A distributed algorithm, correlated Q-learning, is proposed to obtain the correlated equilibrium policies of the system. The algorithm is decentralized and easy to implement in a real system. A numerical example is provided to verify the performance of the proposed algorithm.
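As a small illustration of the solution concept named above, the sketch below checks the standard correlated equilibrium condition for a single-state, two-player matrix game: no player can gain in expectation by deviating from a recommended action. This is not the paper's correlated Q-learning algorithm. The function name `is_correlated_equilibrium` and the payoff values (the textbook game of Chicken) are assumptions chosen only for illustration.

```python
from itertools import product


def is_correlated_equilibrium(p, u1, u2, tol=1e-9):
    """Check the correlated equilibrium inequalities for a two-player game.

    p[a1][a2]  : joint probability of recommending actions (a1, a2)
    u1[a1][a2] : payoff of player 1, u2[a1][a2] : payoff of player 2
    (All inputs are hypothetical illustration data, not taken from the paper.)
    """
    n1, n2 = len(p), len(p[0])

    # Player 1: recommended action a, candidate deviation b.
    for a, b in product(range(n1), repeat=2):
        gain = sum(p[a][k] * (u1[b][k] - u1[a][k]) for k in range(n2))
        if gain > tol:  # deviating from a to b is profitable
            return False

    # Player 2: recommended action a, candidate deviation b.
    for a, b in product(range(n2), repeat=2):
        gain = sum(p[k][a] * (u2[k][b] - u2[k][a]) for k in range(n1))
        if gain > tol:
            return False

    return True


# Game of Chicken (assumed example): action 0 = "dare", action 1 = "yield".
u1 = [[0, 7], [2, 6]]
u2 = [[0, 2], [7, 6]]

# Equal weight on the three outcomes other than (dare, dare).
p_ce = [[0.0, 1 / 3], [1 / 3, 1 / 3]]
# All weight on (yield, yield): each player would rather dare.
p_not_ce = [[0.0, 0.0], [0.0, 1.0]]

print(is_correlated_equilibrium(p_ce, u1, u2))
print(is_correlated_equilibrium(p_not_ce, u1, u2))
```

In a Markovian game such as the one the abstract describes, a condition of this kind would presumably be applied state by state, with learned Q-values in place of the fixed payoffs; that extension is not shown here.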
