Efficient probe selection algorithms for fault diagnosis

The growing use of networks for performance-critical applications has created demand for tools that can monitor network health with minimal management traffic. Adaptive probing has the potential to provide effective tools for end-to-end monitoring and fault diagnosis over a network. Adaptive probing algorithms adjust the probe set to localize faults, sending fewer probes to healthy areas and more probes to areas of suspected failure. In this paper we present adaptive probing tools that provide an effective and efficient solution for fault diagnosis in modern communication systems. We present a system architecture for an adaptive probing based fault diagnosis tool and propose probe selection algorithms for failure detection and fault localization. We compare the performance and efficiency of the proposed algorithms through simulation.
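The two phases named in the abstract, failure detection and fault localization, can be illustrated with a minimal sketch. It is not the algorithms proposed in the paper. It assumes that each probe's path is known in advance as a set of nodes, that a probe fails exactly when some node on its path is faulty, and that detection probes are chosen with a plain greedy set cover. The names `select_detection_probes`, `localize_faults`, and `send_probe` are hypothetical.

```python
from typing import Callable, Dict, FrozenSet, Set, Tuple

# Assumed model: probe name -> set of nodes on that probe's path.
Probes = Dict[str, FrozenSet[str]]


def select_detection_probes(probes: Probes, nodes: Set[str]) -> Set[str]:
    """Greedy set cover: choose probes until every node lies on a chosen path."""
    uncovered = set(nodes)
    chosen: Set[str] = set()
    while uncovered:
        # Take the probe that covers the most uncovered nodes (ties by name).
        best = max(sorted(probes), key=lambda p: len(probes[p] & uncovered))
        if not probes[best] & uncovered:
            break  # the remaining nodes are not on any probe path
        chosen.add(best)
        uncovered -= probes[best]
    return chosen


def localize_faults(
    probes: Probes,
    nodes: Set[str],
    send_probe: Callable[[str], bool],
) -> Tuple[Set[str], Set[str], Dict[str, bool]]:
    """Return (faulty nodes, unresolved suspects, results of the probes sent)."""
    # Phase 1: failure detection with a small covering probe set.
    sent: Dict[str, bool] = {}
    for p in sorted(select_detection_probes(probes, nodes)):
        sent[p] = send_probe(p)

    healthy: Set[str] = set()
    for p, ok in sent.items():
        if ok:
            healthy |= probes[p]  # a successful probe clears its whole path
    suspects: Set[str] = set()
    for p, ok in sent.items():
        if not ok:
            suspects |= probes[p] - healthy

    # Phase 2: adaptive localization. Extra probes go only through suspects;
    # probes that touch only healthy nodes are never sent.
    for p in sorted(probes):
        if p in sent or not (probes[p] & suspects):
            continue
        sent[p] = send_probe(p)
        if sent[p]:
            healthy |= probes[p]
            suspects -= probes[p]
        else:
            suspects |= probes[p] - healthy

    # A node is declared faulty when it is the only node on a failed probe's
    # path that has not been cleared.
    faulty: Set[str] = set()
    for p, ok in sent.items():
        if not ok:
            remaining = probes[p] - healthy
            if len(remaining) == 1:
                faulty |= remaining
    return faulty, suspects - faulty, sent
```

In this sketch the second phase simply tries every unsent probe that crosses a suspect node. A more selective strategy, such as ranking candidate probes by how many suspects they would separate, would fit the same interface.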
