Paper information - Combining inverted indices and structured search for ad-hoc object retrieval

Retrieving semi-structured entities to answer keyword queries is an increasingly important feature of many modern Web applications. The fast-growing Linked Open Data (LOD) movement makes it possible to crawl and index very large amounts of structured data describing hundreds of millions of entities. However, entity retrieval approaches have yet to find efficient and effective ways of ranking and navigating through those large data sets. In this paper, we address the problem of Ad-hoc Object Retrieval over large-scale LOD data by proposing a hybrid approach that combines IR and structured search techniques. Specifically, we propose an architecture that exploits an inverted index to answer keyword queries as well as a semi-structured database to improve the search effectiveness by automatically generating queries over the LOD graph. Experimental results show that our ranking algorithms, which exploit both IR and graph indices, outperform state-of-the-art entity retrieval techniques and improve on the BM25 baseline by up to 25%.
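The general idea of pairing an inverted index with a graph lookup can be illustrated with a minimal sketch. This is not the authors' system: the entity descriptions, the link graph, the function names (`build_index`, `bm25`, `hybrid_rank`), and the parameters `top_k` and `weight` are all hypothetical, and the simple neighbour expansion shown here only stands in for the automatically generated structured queries the abstract mentions.

```python
import math
from collections import Counter, defaultdict

# Toy entity descriptions (hypothetical data, not from the paper).
DOCS = {
    "e1": "zurich city in switzerland",
    "e2": "university of fribourg in switzerland",
    "e3": "eth zurich university",
    "e4": "lake geneva",
}

# Toy entity graph: edges stand in for LOD links between entities.
GRAPH = {"e1": ["e4"], "e3": ["e1"]}


def build_index(docs):
    """Build an inverted index: term -> {entity_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(text.split()).items():
            index[term][doc_id] = tf
    return index


def bm25(query, docs, index, k1=1.2, b=0.75):
    """Score entities against a keyword query with BM25."""
    n = len(docs)
    lengths = {d: len(t.split()) for d, t in docs.items()}
    avg_len = sum(lengths.values()) / n
    scores = defaultdict(float)
    for term in query.split():
        postings = index.get(term, {})
        if not postings:
            continue
        df = len(postings)
        idf = math.log(1 + (n - df + 0.5) / (df + 0.5))
        for doc_id, tf in postings.items():
            norm = tf + k1 * (1 - b + b * lengths[doc_id] / avg_len)
            scores[doc_id] += idf * tf * (k1 + 1) / norm
    return dict(scores)


def hybrid_rank(query, docs, index, graph, top_k=2, weight=0.5):
    """Rank with BM25, then add graph neighbours of the top-k seeds.

    Assumption: a neighbour inherits a discounted share of its seed's
    BM25 score. This scoring rule is illustrative only.
    """
    scores = bm25(query, docs, index)
    seeds = sorted(scores, key=scores.get, reverse=True)[:top_k]
    combined = dict(scores)
    for seed in seeds:
        for neighbour in graph.get(seed, []):
            combined[neighbour] = combined.get(neighbour, 0.0) + weight * scores[seed]
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)


INDEX = build_index(DOCS)
RANKING = hybrid_rank("zurich university", DOCS, INDEX, GRAPH)
```

In this toy run, `e4` contains neither query term, so the inverted index alone never returns it; it enters `RANKING` only through its link from the seed entity `e1`. That is the kind of recall gain a graph-based step can add on top of keyword matching.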

Gianluca Demartini | Philippe Cudré-Mauroux | Alberto Tonon
