Reinforcement Learning Approach to Managing Distributed Data Processing Tasks in Wireless Sensing Networks

As data processing capabilities and techniques continue to improve rapidly across disciplines, the modern engineering community has become increasingly reliant on sensor data to provide an accurate assessment of system behavior and performance [8, 12, 13]. However, because of the high cost of installing and maintaining data cables in large engineered systems, it is often impractical to install sensing transducers in sufficient numbers. As such, wireless sensing systems, which can be deployed at less than one-tenth the cost of traditional tethered systems, are being explored as a new interface between sensing transducer and data repository. In addition to their low cost, wireless sensing networks (WSNs) have shown great promise because of their ability to process sensor data locally at each wireless node. Local data processing is especially advantageous when confronted with the large volumes of data commonly associated with dense sensor networks: transmitting only processed results (instead of raw sensor data) can drastically reduce the amount of data that must be communicated. Accordingly, many different architectures have been developed for embedded data processing in wireless sensors [4, 9, 10]. Recently, the WSN community has begun investigating increasingly parallel methods of in-network processing. For example, data aggregation and fusion techniques [11], query processing [14], and explicitly parallel architectures [21] have all been adopted in an attempt to create a framework for the autonomous, in-network processing of large volumes of sensor data. With these advances in mind, a key remaining challenge is that in a wireless environment, many of the system resources required to perform complex computational tasks (such as processing power and wireless bandwidth) are available only in limited supply.
Consequently, in networks where multiple computational tasks may need to be executed simultaneously, it is important to devise an autonomous method for distributing these scarce system resources across multiple computational objectives. The goal of this study is to apply reinforcement learning (RL) techniques to create offline-trainable, online-learning agents that can successfully allocate a WSN’s scarce resources across a set of simultaneous computational objectives.
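To make the idea concrete, the sketch below shows one minimal way such an agent could be trained offline: a tabular Q-learning loop that learns how to split a fixed budget of resource units (e.g., processor time slots or bandwidth quanta) between two concurrent computational tasks with diminishing-returns utilities. This is an illustrative toy formulation, not the architecture proposed in the study; the budget size, utility shapes, task weights, and all function names are hypothetical assumptions.

```python
import random
from collections import defaultdict

# Hypothetical toy setup: split BUDGET resource units between two tasks.
BUDGET = 6

def utility(task, units):
    """Concave per-task utility: extra units help, but less and less.
    The weights (assumptions) make task 0 'worth' more per unit."""
    weights = [1.0, 0.6]
    return weights[task] * (units ** 0.5)

def step_reward(alloc, action):
    """Marginal utility gained by granting one more unit to `action`."""
    return utility(action, alloc[action] + 1) - utility(action, alloc[action])

def train(episodes=5000, alpha=0.2, gamma=1.0, eps=0.1, seed=0):
    """Offline training phase: epsilon-greedy tabular Q-learning.
    State = (units given to task 0, units given to task 1)."""
    rng = random.Random(seed)
    Q = defaultdict(float)  # Q[(state, action)]
    for _ in range(episodes):
        alloc = [0, 0]
        while sum(alloc) < BUDGET:
            s = tuple(alloc)
            a = rng.randrange(2) if rng.random() < eps else max(
                (0, 1), key=lambda x: Q[(s, x)])
            r = step_reward(alloc, a)
            alloc[a] += 1
            s2 = tuple(alloc)
            if sum(alloc) < BUDGET:
                target = r + gamma * max(Q[(s2, 0)], Q[(s2, 1)])
            else:
                target = r  # terminal state: budget exhausted
            Q[(s, a)] += alpha * (target - Q[(s, a)])
    return Q

def greedy_allocation(Q):
    """Deploy the trained policy: assign units greedily by learned Q-values."""
    alloc = [0, 0]
    while sum(alloc) < BUDGET:
        s = tuple(alloc)
        alloc[max((0, 1), key=lambda x: Q[(s, x)])] += 1
    return alloc

if __name__ == "__main__":
    Q = train()
    print(greedy_allocation(Q))
```

Because rewards are marginal utilities, the episode return telescopes to the total utility of the final allocation, so the learned policy favors the more heavily weighted task while still reserving some units for the other, mirroring (in miniature) how an agent must trade scarce resources off across simultaneous objectives. An online-learning deployment would continue the same update rule after training, adapting as task utilities drift.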

[1] Wei Zhang, et al. A Reinforcement Learning Approach to Job-Shop Scheduling, 1995, IJCAI.

[2] Jerome P. Lynch, et al. Decentralization of wireless monitoring and control technologies for smart civil structures, 2002.

[3] Wang-Chien Lee, et al. Decentralizing query processing in sensor networks, 2005, The Second Annual International Conference on Mobile and Ubiquitous Systems: Networking and Services.

[4] Theodoros Loutas, et al. A Novel Approach for Continuous Acoustic Emission Monitoring on Rotating Machinery Without the Use of Slip Ring, 2008.

[5] Reda Alhajj, et al. State Similarity Based Approach for Improving Performance in RL, 2007, IJCAI.

[6] Jerome P. Lynch, et al. Market-based computational task assignment within autonomous wireless sensor networks, 2009, IEEE International Conference on Electro/Information Technology.

[7] G. Agha, et al. Model-based Data Aggregation for Structural Monitoring Employing Smart Sensors, 2006.

[8] G. Blöschl, et al. The value of MODIS snow cover data in validating and calibrating conceptual hydrologic models, 2008.

[9] Mohan Kumar, et al. Distributed Independent Reinforcement Learning (DIRL) Approach to Resource Management in Wireless Sensor Networks, 2007, IEEE International Conference on Mobile Adhoc and Sensor Systems.

[10] J. P. Lynch, et al. A Parallel Simulated Annealing Architecture for Model Updating in Wireless Sensor Networks, 2009, IEEE Sensors Journal.

[11] Akira Sone, et al. Prototype of sensor network with embedded local data processing, 2005, SPIE Smart Structures and Materials + Nondestructive Evaluation and Health Monitoring.

[12] Deborah Estrin, et al. Pricing in computer networks: motivation, formulation, and example, 1993, IEEE/ACM Transactions on Networking.

[13] Yu-Chi Ho, et al. A Class of Center-Free Resource Allocation Algorithms, 1980.

[14] Jen-Hung Huang, et al. Price-based resource allocation for wireless ad hoc networks with multi-rate capability and energy constraints, 2008, Comput. Commun.

[15] David Vengerov, et al. A reinforcement learning framework for utility-based scheduling in resource-constrained systems, 2009, Future Gener. Comput. Syst.

[16] Jerome P. Lynch, et al. Embedding damage detection algorithms in a wireless sensing unit for operational power efficiency, 2004.

[17] H. F. Zhou, et al. Modal Flexibility Analysis of Cable-Stayed Ting Kau Bridge for Damage Identification, 2008, Comput. Aided Civ. Infrastructure Eng.

[18] Marimuthu Palaniswami, et al. Application-Oriented Flow Control for Wireless Sensor Networks, 2007, International Conference on Networking and Services (ICNS '07).

[19] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996.

[20] Peter Stone, et al. Behavior transfer for value-function-based reinforcement learning, 2005, AAMAS '05.

[21] Jerome P. Lynch, et al. Design of a Wireless Sensor for Scalable Distributed In-Network Computation in a Structural Health Monitoring System, 2006.