A distributed approach to node clustering in decentralized peer-to-peer networks

Connectivity-based node clustering has wide-ranging applications in decentralized peer-to-peer (P2P) networks, such as P2P file-sharing systems, mobile ad hoc networks, and P2P sensor networks. This paper describes a connectivity-based distributed node clustering scheme (CDC), a scalable and efficient solution for discovering connectivity-based clusters in peer networks. In contrast to centralized graph clustering algorithms, the CDC scheme is completely decentralized: it assumes only knowledge of neighbor nodes rather than requiring global knowledge of the network (graph). An important feature of the CDC scheme is its ability either to cluster the entire network automatically or to discover clusters around a given set of nodes. To cope with the typical dynamics of P2P networks, we provide mechanisms that incorporate new nodes into appropriate existing clusters and gracefully handle the departure of nodes from clusters. These mechanisms make the CDC scheme extensible and adaptable, in the sense that the clustering structure of the network adjusts automatically as nodes join or leave the system. We provide detailed experimental evaluations of the CDC scheme, addressing its effectiveness in discovering good-quality clusters and in handling node dynamics. We further study the types of topologies that benefit most from connectivity-based distributed clustering algorithms like CDC. Our experiments show that utilizing message-based connectivity structure can considerably reduce the messaging cost and provide better utilization of resources, which in turn improves the quality of service of applications executing over decentralized peer-to-peer networks.
