Decentralised reinforcement learning for energy-efficient scheduling in wireless sensor networks

We present a self-organising reinforcement learning (RL) approach for scheduling the wake-up cycles of nodes in a wireless sensor network. The approach is fully decentralised and allows sensor nodes to schedule their active periods based only on their interactions with neighbouring nodes. Compared with standard scheduling mechanisms such as S-MAC, the proposed approach offers two benefits. First, the nodes do not need to synchronise explicitly, since synchronisation is achieved through the successful exchange of data messages during data collection. Second, the learning process allows nodes competing for the radio channel to desynchronise, so that radio interference and hence packet collisions are significantly reduced. This results in shorter communication schedules, which not only reduces energy consumption by shortening the wake-up cycles of sensor nodes, but also decreases data retrieval latency. We implement this RL approach in the OMNeT++ sensor network simulator and illustrate how sensor nodes arranged in line, mesh and grid topologies autonomously discover schedules that favour the successful delivery of messages along a routing tree while avoiding interference.
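The core idea of decentralised desynchronisation can be sketched as follows. This is a hypothetical minimal illustration, not the paper's exact algorithm: each node runs an independent learner over the slots of a communication frame, choosing a wake-up slot epsilon-greedily and receiving a positive reward when its transmission succeeds (no neighbour chose the same slot) and a negative reward on a collision. The class and function names (`Node`, `run_frame`), the slot count, and the reward values are illustrative assumptions.

```python
import random


class Node:
    """One sensor node learning which frame slot to wake up in.

    Hypothetical sketch: a stateless Q-learner over wake-up slots,
    rewarded for collision-free transmissions.
    """

    def __init__(self, n_slots, alpha=0.1, epsilon=0.1, seed=None):
        self.q = [0.0] * n_slots          # estimated value of each slot
        self.alpha = alpha                # learning rate
        self.epsilon = epsilon            # exploration probability
        self.rng = random.Random(seed)

    def choose_slot(self):
        # Epsilon-greedy: mostly exploit the best-valued slot.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.q))
        return self.q.index(max(self.q))

    def update(self, slot, reward):
        # Move the slot's value estimate toward the observed reward.
        self.q[slot] += self.alpha * (reward - self.q[slot])


def run_frame(nodes):
    """One frame: every node picks a slot; lone senders succeed (+1),
    nodes sharing a slot collide (-1)."""
    choices = [n.choose_slot() for n in nodes]
    for node, slot in zip(nodes, choices):
        collided = choices.count(slot) > 1
        node.update(slot, -1.0 if collided else 1.0)
    return choices


# Three interfering nodes, four slots per frame.
nodes = [Node(n_slots=4, seed=i) for i in range(3)]
for _ in range(2000):
    run_frame(nodes)

# Switch off exploration and read out the learned schedule.
for n in nodes:
    n.epsilon = 0.0
final = [n.choose_slot() for n in nodes]
```

After training, nodes tend to settle on distinct slots because a shared slot keeps yielding negative reward until one of the contenders moves away, which is the desynchronisation effect exploited in the paper.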