Decentralized Computation of Threshold Crossing Alerts

Threshold crossing alerts (TCAs) indicate to a management system that a management variable, associated with the state, performance or health of the network, has crossed a certain threshold. The timely detection of TCAs is essential to proactive management. This paper focuses on detecting TCAs for network-level variables, which are computed from device-level variables using aggregation functions such as SUM, MAX, or AVERAGE. It introduces TCA-GAP, a novel protocol for producing network-wide TCAs in a scalable and robust manner. The protocol maintains a spanning tree and uses local thresholds, which adapt to changes in network state and topology by allowing nodes to trade unused “threshold space”. Scalability is achieved by computing the thresholds locally and by distributing the aggregation process across all nodes. Fault tolerance is achieved by a mechanism that reconstructs the spanning tree after node addition, removal or failure. Simulation results on an ISP topology show that the protocol successfully concentrates traffic overhead in periods when the aggregate is close to the given threshold.
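
The local-threshold idea for a SUM aggregate can be illustrated with a minimal sketch. This is not the TCA-GAP protocol itself: the spanning tree, the message exchange, and the actual threshold-adaptation rules are omitted, and every name below (`Node`, `assign_local_thresholds`, `trade_threshold_space`) is hypothetical. The sketch assumes that the global threshold is split evenly among the nodes and that a node exceeding its local threshold can borrow unused space from a single donor. As long as the local thresholds sum to no more than the global threshold and no node exceeds its own, the SUM cannot have crossed the global threshold, so no network-wide aggregation is needed.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    value: float                  # local device-level variable
    local_threshold: float = 0.0  # this node's share of the global threshold

    def slack(self) -> float:
        """Unused threshold space (negative if the node is in violation)."""
        return self.local_threshold - self.value


def assign_local_thresholds(nodes, global_threshold):
    """Assumption: split the global threshold evenly across all nodes."""
    share = global_threshold / len(nodes)
    for node in nodes:
        node.local_threshold = share


def violators(nodes):
    """Nodes whose local value exceeds their local threshold."""
    return [n for n in nodes if n.value > n.local_threshold]


def trade_threshold_space(needy, donor):
    """Move unused threshold space from donor to needy.

    The sum of local thresholds is unchanged, so it stays bounded by the
    global threshold. Returns True if the needy node is no longer in
    violation after the trade.
    """
    deficit = needy.value - needy.local_threshold
    if deficit <= 0:
        return True
    transfer = min(deficit, max(donor.slack(), 0.0))
    donor.local_threshold -= transfer
    needy.local_threshold += transfer
    return needy.value <= needy.local_threshold


# Usage: three nodes, global SUM threshold of 120.
nodes = [Node("a", 10.0), Node("b", 20.0), Node("c", 50.0)]
assign_local_thresholds(nodes, 120.0)       # each node gets 40
resolved = trade_threshold_space(nodes[2], nodes[0])  # c borrows from a
```

In this example node `c` initially violates its local threshold, but the trade with `a` resolves the violation locally. Only when no neighbour has enough slack would the aggregate need to be checked against the global threshold, which matches the abstract's observation that overhead concentrates in periods when the aggregate is close to the threshold.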
