Distributed Information Retrieval: A Multi-Objective Resource Selection Approach

Information retrieval is increasingly concerned with resource selection and data fusion for distributed archives. In distributed information retrieval, a user submits a query to a broker, which determines how to obtain a given number of documents from the available resources. In this paper, we present a multi-objective model for resource selection that considers four aspects simultaneously: a document's relevance to the given query, time, monetary cost, and the chance of retrieving duplicate documents from the resources. We also propose several variants of this model that aim for more efficient implementation.
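To make the idea of weighing four objectives at once concrete, the following is a minimal sketch and not the model proposed in the paper, whose formulation is not given in this abstract. It assumes a simple weighted-sum scalarization, one common way to combine several objectives into a single score, followed by a proportional split of the requested documents across resources. The names Resource, utility, and allocate, the normalization of every attribute to [0, 1], and the default weights are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """Hypothetical per-resource estimates, each assumed normalized to [0, 1]."""
    name: str
    relevance: float   # estimated relevance of its documents to the query
    time: float        # expected retrieval time (higher is worse)
    cost: float        # monetary cost (higher is worse)
    duplicates: float  # estimated chance of returning duplicate documents


def utility(r, weights=(0.4, 0.2, 0.2, 0.2)):
    """Weighted-sum score: reward relevance, penalize time, cost, duplicates.

    The weights are illustrative assumptions, not values from the paper.
    """
    w_rel, w_time, w_cost, w_dup = weights
    return (w_rel * r.relevance
            - w_time * r.time
            - w_cost * r.cost
            - w_dup * r.duplicates)


def allocate(resources, n_docs, weights=(0.4, 0.2, 0.2, 0.2)):
    """Split n_docs across resources in proportion to their positive utility.

    Resources with non-positive utility receive no documents. Fractional
    shares are rounded with the largest-remainder method so that the
    allocation sums to exactly n_docs whenever some utility is positive.
    """
    scores = {r.name: max(utility(r, weights), 0.0) for r in resources}
    total = sum(scores.values())
    if total == 0:
        return {name: 0 for name in scores}
    shares = {name: n_docs * s / total for name, s in scores.items()}
    alloc = {name: int(share) for name, share in shares.items()}
    leftover = n_docs - sum(alloc.values())
    # Hand the remaining documents to the largest fractional remainders.
    by_remainder = sorted(shares, key=lambda n: shares[n] - alloc[n],
                          reverse=True)
    for name in by_remainder[:leftover]:
        alloc[name] += 1
    return alloc


if __name__ == "__main__":
    candidates = [
        Resource("A", relevance=0.9, time=0.2, cost=0.5, duplicates=0.1),
        Resource("B", relevance=0.6, time=0.1, cost=0.0, duplicates=0.3),
        Resource("C", relevance=0.2, time=0.9, cost=0.8, duplicates=0.5),
    ]
    print(allocate(candidates, n_docs=10))
```

A weighted sum is only one way to handle multiple objectives; it is used here because it keeps the trade-off between relevance, time, cost, and duplicates visible in a single line. Changing the weights shifts the broker's preference, for example toward cheaper or faster resources.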
