Distributed Information Retrieval using Keyword Auctions

This report motivates the need for large-scale distributed approaches to information retrieval and proposes solutions based on keyword auctions.
