Scalable Content-Based Ranking in P2P Information Retrieval

Numerous retrieval models have been defined within the field of information retrieval (IR) to produce a ranked list of documents relevant to a given query. Existing models are in general well explored and have been thoroughly evaluated using traditional centralized IR engines. However, the problem of producing global relevance scores to enable document ranking in peer-to-peer (P2P) IR systems has largely been neglected. Traditional ranking models generally require global collection statistics, such as document frequency, average document length, or the number of documents in the collection, which are not readily available in P2P IR systems. In this paper, we present a scalable solution for content-based ranking using global relevance scores in P2P IR systems. The solution has been implemented as part of ALVIS PEERS, a full-text IR engine developed for structured P2P networks. Experimental results show that the proposed ranking implementation is efficient and scalable.
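
To illustrate why such global statistics matter, the following minimal sketch scores a document with a BM25-style function. It is an illustration under stated assumptions, not the ALVIS PEERS implementation: the choice of BM25, the function names (`local_stats`, `merge_stats`, `bm25`), and the parameter defaults are all hypothetical, and the simple summation stands in for whatever aggregation mechanism a real P2P system would use.

```python
import math
from collections import Counter

def local_stats(docs):
    """Statistics a single peer can compute from its own documents (hypothetical helper)."""
    df = Counter()
    total_len = 0
    for tokens in docs:
        total_len += len(tokens)
        df.update(set(tokens))  # count each term once per document
    return {"n_docs": len(docs), "total_len": total_len, "df": df}

def merge_stats(all_stats):
    """Sum per-peer statistics into collection-wide statistics.

    Assumes at least one document overall; a real system would aggregate
    these values over the network rather than in one place.
    """
    n_docs = sum(s["n_docs"] for s in all_stats)
    total_len = sum(s["total_len"] for s in all_stats)
    df = Counter()
    for s in all_stats:
        df.update(s["df"])
    return {"n_docs": n_docs, "avg_len": total_len / n_docs, "df": df}

def bm25(query, doc, stats, k1=1.2, b=0.75):
    """BM25-style score of one tokenized document for a tokenized query."""
    tf = Counter(doc)
    score = 0.0
    for term in query:
        if tf[term] == 0:
            continue
        n_t = stats["df"][term]  # document frequency of the term
        idf = math.log(1 + (stats["n_docs"] - n_t + 0.5) / (n_t + 0.5))
        norm = tf[term] + k1 * (1 - b + b * len(doc) / stats["avg_len"])
        score += idf * tf[term] * (k1 + 1) / norm
    return score

# Two peers, each holding part of the collection.
peer_a = [["p2p", "ranking"], ["p2p", "index"]]
peer_b = [["ranking", "model", "bm25"]]

global_stats = merge_stats([local_stats(peer_a), local_stats(peer_b)])
peer_b_only = merge_stats([local_stats(peer_b)])

# The same document receives different scores under local and global statistics.
print(bm25(["ranking"], peer_b[0], global_stats))
print(bm25(["ranking"], peer_b[0], peer_b_only))
```

In this sketch, a score computed from one peer's local statistics differs from the score computed from the merged statistics, so results returned by different peers would not be directly comparable without the global values.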