Plexus: A Scalable Peer-to-Peer Protocol Enabling Efficient Subset Search

Efficient discovery of information based on partial knowledge is a challenging problem faced by many large-scale distributed systems. This paper presents Plexus, a peer-to-peer search protocol that provides an efficient mechanism for advertising a bit-sequence (pattern) and discovering it using any subset of its 1-bits. A pattern (e.g., Bloom filter) summarizes the properties (e.g., keywords, service description) associated with a shared object (e.g., document, service). Plexus has a partially decentralized architecture involving super-peers. It adopts a novel structured routing mechanism derived from the theory of error-correcting codes (ECC). Plexus achieves better resilience to peer failure by utilizing replication and redundant routing paths. Routing efficiency in Plexus scales logarithmically with the number of super-peers. The concept presented in this paper is supported by theoretical analysis and by simulation results obtained from the application of Plexus to partial keyword search utilizing the extended Golay code.
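
The subset-search predicate described above (a query matches an advertised pattern when every 1-bit of the query is also set in the pattern) can be illustrated with a small Bloom-filter sketch. This is only an illustration of the matching condition, not the Plexus routing mechanism: the pattern width `M`, the number of hash functions `K`, the use of SHA-256, and the helper names `keyword_bits`, `make_pattern`, and `subset_match` are all assumptions introduced here.

```python
import hashlib

M = 24  # pattern width in bits (illustrative assumption)
K = 2   # hash functions per keyword (illustrative assumption)


def keyword_bits(keyword: str, m: int = M, k: int = K) -> int:
    """Return an m-bit mask with at most k bits set for one keyword."""
    mask = 0
    for i in range(k):
        # Salt the keyword with the hash index to get k different positions.
        digest = hashlib.sha256(f"{i}:{keyword}".encode("utf-8")).digest()
        position = int.from_bytes(digest[:8], "big") % m
        mask |= 1 << position
    return mask


def make_pattern(keywords) -> int:
    """OR together the masks of all keywords (a Bloom-filter-style pattern)."""
    pattern = 0
    for keyword in keywords:
        pattern |= keyword_bits(keyword)
    return pattern


def subset_match(query: int, advertised: int) -> bool:
    """True when every 1-bit of the query is also set in the advertised pattern."""
    return (query & advertised) == query


advertised = make_pattern(["golay", "code", "search"])
query = make_pattern(["golay", "search"])  # partial knowledge of the object
print(subset_match(query, advertised))
```

A query built from a subset of the advertised keywords always matches, because its 1-bits are by construction a subset of the advertised pattern's 1-bits. As with any Bloom filter, unrelated keywords can occasionally match by coincidence (a false positive), so a real system would verify candidates against the actual object description.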
