Processing Queries in a Large Peer-to-Peer System

While current search engines seem to handle the volume of data available on the Internet with ease, they cannot provide fresh results: the most up-to-date data always resides at the data sources. Efficiently interconnecting data providers, however, is not an easy problem. Peer-to-peer computing is the latest technology to address it, yet efficient query processing in peer-to-peer networks remains an open research area. In this paper, we present a performance study of a system that facilitates efficient searches across large numbers of independent data providers on the Internet. In our scenario, each data provider becomes an autonomous node in a large peer-to-peer system. Using small indices on each node, we can efficiently direct queries submitted at any node to the relevant sources. Experiments with a large peer-to-peer network demonstrate the feasibility of our approach.
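
The abstract does not say how the per-node indices are structured, so the following is only an illustrative sketch of the general idea rather than the system's actual design. It assumes, hypothetically, that each node keeps a small map from a search term to the neighbors through which matching data can be reached, and that a query is forwarded only along those neighbors. The names `Node`, `build_indices`, and `route_query` are invented for this example.

```python
from collections import defaultdict


class Node:
    """A data provider acting as an autonomous peer (hypothetical model)."""

    def __init__(self, name, documents=()):
        self.name = name
        self.documents = list(documents)
        self.neighbors = {}            # neighbor name -> Node
        self.index = defaultdict(set)  # term -> names of neighbors leading to it

    def connect(self, other):
        # Links are assumed to be bidirectional.
        self.neighbors[other.name] = other
        other.neighbors[self.name] = self

    def local_terms(self):
        return {t for d in self.documents for t in d.lower().split()}


def reachable_terms(start, blocked):
    """Collect terms held by nodes reachable from `start` without passing `blocked`."""
    seen = {blocked.name, start.name}
    stack, terms = [start], set()
    while stack:
        node = stack.pop()
        terms |= node.local_terms()
        for nb in node.neighbors.values():
            if nb.name not in seen:
                seen.add(nb.name)
                stack.append(nb)
    return terms


def build_indices(nodes):
    """Fill each node's index: for every term, which neighbors lead to matching data."""
    for node in nodes:
        node.index.clear()
        for nb in node.neighbors.values():
            for term in reachable_terms(nb, blocked=node):
                node.index[term].add(nb.name)


def route_query(node, term, visited=None):
    """Answer locally, then forward only to neighbors the index marks as relevant."""
    visited = set() if visited is None else visited
    visited.add(node.name)
    hits = [(node.name, d) for d in node.documents if term in d.lower().split()]
    for nb_name in sorted(node.index.get(term, ())):
        if nb_name not in visited:
            hits += route_query(node.neighbors[nb_name], term, visited)
    return hits
```

In a small network where node `a` links to `b` and `d`, and `b` links to `c`, a query for a term held only by `a` and `c` would travel along `a`, `b`, `c` and never contact `d`, because `a`'s index lists no match behind `d`. This sketch builds the indices from global knowledge for brevity; a real peer-to-peer system would have to construct and maintain them through message exchange between neighbors.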
