We design and evaluate a distributed information retrieval system that operates over a mobile network where wireless infrastructure is unavailable. Such networks are common in developing nations, in disaster-stricken areas, and even in rural areas of technologically advanced countries. This poses a new challenge for distributed IR, which normally relies on a wired Internet or on always-available wireless coverage among mobile peers. In our mobile system, queries propagate among peers only when they intermittently come within wireless range of one another. For each query received, a peer retrieves the top-ranked documents from its local collection and sends them toward the source of the query. Intermediate peers on the path to the source must manage a finite buffer filled with documents from multiple collections and multiple queries. When too many documents are in the system, intermediate peers must drop documents that are either unlikely to be relevant or unlikely to find a successful path to their destination. To enable such a system, we propose a score normalization technique that works across queries and across multiple collections. We show that our method returns more relevant documents in the mobile network than existing normalization methods, which are not intended for multiple queries. Additionally, we compare our approach to existing networking algorithms for delivering data in such challenged networks. We show that although our method delivers fewer documents in total, it delivers significantly more documents that are relevant to the sources of queries.
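
The buffer management described above can be illustrated with a minimal sketch. It is not the method evaluated in this work: the names `BufferedDoc`, `utility`, and `enforce_capacity` are hypothetical, the relevance value stands in for a score already normalized to be comparable across queries and collections, and combining relevance with delivery likelihood by multiplication is an assumption made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BufferedDoc:
    doc_id: str
    query_id: str
    relevance: float      # hypothetical normalized score in [0, 1], comparable across queries
    delivery_prob: float  # hypothetical estimate in [0, 1] of reaching the query source


def utility(doc: BufferedDoc) -> float:
    # Assumption: the two drop criteria are combined by multiplication,
    # so a document scores low if either value is low.
    return doc.relevance * doc.delivery_prob


def enforce_capacity(buffer: list[BufferedDoc], capacity: int) -> list[BufferedDoc]:
    """Keep the `capacity` highest-utility documents and drop the rest."""
    ranked = sorted(buffer, key=utility, reverse=True)
    return ranked[:capacity]


# Documents from two different queries compete for the same buffer slots.
buffer = [
    BufferedDoc("d1", "q1", relevance=0.9, delivery_prob=0.8),
    BufferedDoc("d2", "q1", relevance=0.2, delivery_prob=0.9),
    BufferedDoc("d3", "q2", relevance=0.7, delivery_prob=0.1),
    BufferedDoc("d4", "q2", relevance=0.6, delivery_prob=0.7),
]
kept = enforce_capacity(buffer, capacity=2)
print([d.doc_id for d in kept])
```

In this toy run, `d2` is dropped because it is unlikely to be relevant and `d3` because it is unlikely to be delivered. The comparison across `q1` and `q2` is meaningful only if the relevance scores are normalized across queries, which is the role of the proposed normalization technique.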