A task scheduling algorithm based on Q-learning and shared value function for WSNs

Abstract In dynamic Wireless Sensor Networks (WSNs), each sensor node should be allowed to schedule tasks by itself based on current environmental changes. Task scheduling on each sensor node should be performed online to balance the tradeoff between resource utilization and application performance. To address the problem of frequent exchange of cooperative information in existing cooperative learning algorithms, a task scheduling algorithm for WSNs based on Q-learning and a shared value function, named QS, is proposed. Specifically, both the task model for target monitoring applications and the cooperative Q-learning model are established, and the basic elements of reinforcement learning, including the delayed reward and the state space, are defined. Moreover, based on how the value function changes, QS imposes a sending constraint and an expiration constraint on state values to reduce the exchange frequency of cooperative information while preserving the effect of cooperative learning. Experimental results on NS-3 show that QS performs task scheduling dynamically in response to current environmental changes; compared with other cooperative learning algorithms, QS achieves better application performance with acceptable energy consumption and allows each sensor node to complete its functional tasks normally.

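As a concrete illustration of the mechanism described in the abstract, the following Python sketch shows one possible form of cooperative Q-learning with a shared value function, where value exchange is gated by a sending constraint and an expiration constraint on state values. The QSNode class, the neighbor-value blending weight, and all threshold constants below are illustrative assumptions for this sketch, not the exact design or parameters of QS.

import time
import random
from collections import defaultdict

# Minimal sketch of cooperative Q-learning with a shared value function.
# The constants, neighbor-weighting scheme, and constraint thresholds are
# assumptions for illustration, not the parameters reported in the paper.

ALPHA = 0.1          # learning rate (assumed)
GAMMA = 0.9          # discount factor (assumed)
EPSILON = 0.1        # epsilon-greedy exploration rate (assumed)
SEND_THRESHOLD = 0.5 # "sending constraint": broadcast V(s) only if it changed this much
EXPIRY_SECONDS = 30  # "expiration constraint": ignore neighbor values older than this

class QSNode:
    def __init__(self, node_id, actions, neighbor_weight=0.3):
        self.node_id = node_id
        self.actions = actions                  # e.g. ["sense", "transmit", "sleep"]
        self.q = defaultdict(float)             # Q[(state, action)]
        self.last_sent_value = {}               # V(s) at the time it was last broadcast
        self.neighbor_values = {}               # {state: (value, timestamp)} from neighbors
        self.neighbor_weight = neighbor_weight  # weight of shared values in the update (assumed)

    def value(self, state):
        return max(self.q[(state, a)] for a in self.actions)

    def choose_action(self, state):
        if random.random() < EPSILON:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def shared_value(self, state):
        """Neighbor value for `state`, honoring the expiration constraint."""
        entry = self.neighbor_values.get(state)
        if entry is None:
            return None
        value, timestamp = entry
        if time.time() - timestamp > EXPIRY_SECONDS:
            del self.neighbor_values[state]     # drop stale cooperative information
            return None
        return value

    def update(self, state, action, reward, next_state):
        """One Q-learning step that blends local and shared values of next_state."""
        local_v = self.value(next_state)
        shared_v = self.shared_value(next_state)
        if shared_v is None:
            target_v = local_v
        else:
            target_v = (1 - self.neighbor_weight) * local_v + self.neighbor_weight * shared_v
        td_target = reward + GAMMA * target_v
        self.q[(state, action)] += ALPHA * (td_target - self.q[(state, action)])
        return self.maybe_broadcast(state)

    def maybe_broadcast(self, state):
        """Sending constraint: exchange V(s) only when it has changed significantly."""
        current_v = self.value(state)
        if abs(current_v - self.last_sent_value.get(state, 0.0)) >= SEND_THRESHOLD:
            self.last_sent_value[state] = current_v
            return (self.node_id, state, current_v, time.time())  # message for neighbors
        return None                                               # suppress the exchange

    def receive(self, state, value, timestamp):
        self.neighbor_values[state] = (value, timestamp)

In this sketch, a node broadcasts V(s) only when it has drifted by at least SEND_THRESHOLD since the last broadcast, and discards neighbor values older than EXPIRY_SECONDS; this is one way to trade a small loss in cooperative learning accuracy for far fewer control messages, which is the tradeoff the abstract attributes to the sending and expiration constraints.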