Publish/Subscribe Functionalities for Future Digital Libraries using Structured Overlay Networks

We are interested in the problem of distributed resource sharing in future digital libraries (DLs). We adopt a pure peer-to-peer (P2P) architecture (illustrated in Figure 1), but our ideas can easily be adapted to hierarchical P2P networks, as in [3]. Information providers (DLs) and information consumers (users) are both represented by peers participating in a P2P overlay network. We expect this architecture to offer two kinds of basic functionality: information retrieval (IR) and publish/subscribe (pub/sub). In an IR scenario a user poses a query (e.g., “I am interested in papers on bio-informatics”) and the system returns information about matching resources. In a pub/sub scenario (also known as information filtering (IF) or selective dissemination of information (SDI)) a user posts a subscription (or profile or continuous query) to the system in order to receive notifications whenever certain events of interest take place (e.g., when a paper on bio-informatics becomes available).

In this extended abstract we concentrate on the latter kind of functionality (pub/sub) and sketch how to provide it by extending the distributed hash table Chord [4]. Distributed hash tables (DHTs) are second-generation structured P2P overlay networks, devised as a remedy for the known limitations of earlier P2P networks such as Napster and Gnutella. We present a set of protocols, collectively called DHTrie, that extend the Chord protocols with pub/sub functionality.

We assume that resources are annotated using a well-understood attribute-value model called AWPS in [2]. Publications and subscriptions are therefore also expressed in AWPS. AWPS is based on named attributes whose values are free text, interpreted under the Boolean and vector space (VSM) models.
The query language of AWPS allows Boolean combinations of comparisons A op v, where A is an attribute, v is a text value and op is one of the operators “equals”, “contains” or “similar” (“equals” and “contains” are Boolean operators, while “similar” is interpreted using the VSM or the latent semantic indexing (LSI) model). The following is an example of a publication in AWPS:

{ (AUTHOR, “John Smith”), (TITLE, “Information dissemination in P2P ...”), (ABSTRACT, “In this paper we show that ...”) }
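To make the comparisons A op v concrete, the following is a minimal Python sketch of how a subscription could be checked against the example publication. It is an illustration under stated assumptions, not the AWPS semantics of [2] or the DHTrie protocols: the names `tokenize`, `cosine`, `matches` and `satisfies` are hypothetical, “contains” is simplified to "every word of v occurs in the attribute text", “similar” uses the cosine of raw term-frequency vectors with an arbitrary threshold of 0.5, and a subscription is restricted to a conjunction of comparisons.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity of raw term-frequency vectors (assumed weighting)."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * \
        math.sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0


def matches(publication, attribute, op, value, threshold=0.5):
    """Evaluate one comparison `attribute op value` against a publication."""
    text = publication.get(attribute)
    if text is None:
        return False
    if op == "equals":
        return tokenize(text) == tokenize(value)
    if op == "contains":
        # Simplifying assumption: every word of `value` occurs in the text.
        return set(tokenize(value)) <= set(tokenize(text))
    if op == "similar":
        # Assumption: "similar" holds when the cosine reaches the threshold.
        return cosine(text, value) >= threshold
    raise ValueError(f"unknown operator: {op}")


def satisfies(publication, subscription):
    """A subscription here is a conjunction of (attribute, op, value) triples."""
    return all(matches(publication, a, op, v) for a, op, v in subscription)


# The example publication from the text, as an attribute -> text mapping.
publication = {
    "AUTHOR": "John Smith",
    "TITLE": "Information dissemination in P2P ...",
    "ABSTRACT": "In this paper we show that ...",
}

# A hypothetical subscription combining all three operators.
subscription = [
    ("AUTHOR", "equals", "John Smith"),
    ("TITLE", "contains", "P2P dissemination"),
    ("TITLE", "similar", "information dissemination"),
]

if __name__ == "__main__":
    print(satisfies(publication, subscription))
```

In a pub/sub setting the roles are reversed with respect to IR: subscriptions are stored in the system and each incoming publication is tested against them, so a check of the kind performed by `satisfies` would run once per candidate subscription whenever a publication arrives.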