Architectural Alternatives for Information Filtering in Structured Overlays

Content providers are naturally distributed and produce large amounts of new information every day. Peer-to-peer information filtering is a promising approach that offers scalability, adaptivity to high dynamics, and failure resilience. The authors developed two approaches that both use the Chord distributed hash table as the routing substrate: one stresses retrieval effectiveness, whereas the other relaxes recall guarantees to achieve lower message traffic and thus better scalability. This article highlights the two approaches' main characteristics, presents the issues and trade-offs involved in their design, and compares them in terms of scalability, efficiency, and filtering effectiveness.
