Post-Deployment Anomaly Detection and Diagnosis in Networked Embedded Systems by Program Profiling and Symptom Mining

Detecting and diagnosing anomalies in networked embedded systems such as sensor networks is difficult because of variable workloads and severe resource constraints. In this paper, we focus on aiding bug diagnosis after the system has been deployed. We observe that most node-level debugging tools provide detailed program information inside a node but cannot detect when and where a problem occurs in the network. Conversely, most network-level diagnosis tools can detect a problem in the network but cannot narrow it down within a node, because they lack detailed program information. To close this gap, we propose D2, a method for post-deployment anomaly detection and diagnosis in networked embedded systems that combines program profiling and symptom mining. D2 employs binary instrumentation to perform lightweight function count profiling. Based on these statistics, D2 uses a PCA (principal component analysis) based approach to detect network anomalies automatically. Compared with previous methods, D2 points programmers closer to the most likely causes through a novel approach that combines statistical tests with program call graph analysis. We implement our method on TinyOS 2.1.1 and evaluate its effectiveness through case studies in the development of a working sensor network. Results show that our method helps programmers diagnose problems quickly in real-world sensor network systems while incurring an acceptable overhead on the running system.
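To make the idea of PCA-based detection over function count profiles concrete, the following is a minimal sketch and not the D2 implementation. It assumes each profile is a vector of per-function call counts collected over one reporting interval. It fits a single principal direction to a baseline of normal profiles, then scores new profiles by their squared residual outside that direction. The function columns, node names, counts, and the use of only one component are all illustrative assumptions. The abstract does not specify how D2 selects components or thresholds.

```python
import math

# Hypothetical profile layout: call counts for three instrumented functions
# (send, receive, timer_fired) in one reporting interval. Names are illustrative.

def fit_pca1(rows, iters=100):
    """Return (means, v): column means and the leading principal direction,
    found by power iteration on X^T X of the centered data."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    v = [1.0 / math.sqrt(d)] * d  # arbitrary unit-length starting vector
    for _ in range(iters):
        proj = [sum(r[j] * v[j] for j in range(d)) for r in centered]
        w = [sum(proj[i] * centered[i][j] for i in range(n)) for j in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance in the data; keep the current vector
            break
        v = [x / norm for x in w]
    return means, v

def residual_score(row, means, v):
    """Squared prediction error: the part of the centered profile that the
    principal direction does not explain."""
    c = [x - m for x, m in zip(row, means)]
    t = sum(a * b for a, b in zip(c, v))
    return sum((a - t * b) ** 2 for a, b in zip(c, v))

# Baseline: counts scale together with workload k (made-up numbers).
baseline = [[10 * k, 10 * k, k] for k in range(1, 7)]
means, v = fit_pca1(baseline)

# node_a is busy but follows the usual pattern; node_b sends without receiving.
observed = {"node_a": [70, 70, 7], "node_b": [70, 5, 7]}
scores = {name: residual_score(p, means, v) for name, p in observed.items()}
suspect = max(scores, key=scores.get)
```

In this toy data, a heavier workload alone moves a profile along the principal direction and leaves the residual near zero, whereas a changed ratio between function counts produces a large residual. That separation is the reason a residual-based score can tolerate variable workloads.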
