Query Interpretations from Entity-Linked Segmentations

Web search queries can be ambiguous: is “source of the nile” meant to find information on the actual river or on a board game of that name? We tackle this problem by deriving entity-based query interpretations: given a query, the task is to derive all reasonable ways of linking suitable parts of the query to semantically compatible entities in a background knowledge base. Our approach targets efficiency as well as effectiveness, since web search response times should not exceed a few hundred milliseconds. We use query segmentation as a preprocessing step that finds promising segment-based “interpretation skeletons”. The individual segments of these skeletons are then linked to entities from a knowledge base, and the reasonable combinations are ranked in a final step. An experimental comparison on a combined corpus of all existing query entity linking datasets shows that our approach achieves better interpretation accuracy at a better run time than the previously most effective methods.
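The three steps named in the abstract (segment, link, rank) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy knowledge base `TOY_KB`, its entity names and prior scores, and the functions `segmentations` and `interpretations` are all hypothetical. In particular, the sketch enumerates every segmentation exhaustively, whereas the approach described above uses a query segmentation step to keep only promising skeletons.

```python
from itertools import product

# Hypothetical toy knowledge base: surface form -> [(entity, prior score)].
# Entity names and scores are invented for illustration only.
TOY_KB = {
    "source of the nile": [("Source_of_the_Nile_(board_game)", 0.6)],
    "nile": [("Nile", 0.9)],
}


def segmentations(words):
    """Yield every split of the word list into contiguous segments."""
    if not words:
        yield []
        return
    for i in range(1, len(words) + 1):
        head = " ".join(words[:i])
        for rest in segmentations(words[i:]):
            yield [head] + rest


def interpretations(query, kb=TOY_KB):
    """Link segments to candidate entities and rank the combinations."""
    results = {}
    for segs in segmentations(query.lower().split()):
        # Each segment either stays unlinked (None) or takes one candidate.
        options = [
            [(seg, None, 0.0)] + [(seg, ent, prior) for ent, prior in kb.get(seg, [])]
            for seg in segs
        ]
        for combo in product(*options):
            entities = tuple(ent for _, ent, _ in combo if ent is not None)
            if not entities:
                continue  # skip combinations that link nothing
            # Assumed scoring: sum of the priors of the linked entities.
            score = sum(prior for _, _, prior in combo)
            results[entities] = max(score, results.get(entities, 0.0))
    # Highest-scoring interpretation first.
    return sorted(results.items(), key=lambda item: item[1], reverse=True)


ranked = interpretations("source of the nile")
```

With this toy data, `ranked` holds two interpretations: one that links only the segment "nile" to the river, and one that links the whole query to the board game. A realistic system would replace the summed priors with a learned or relatedness-based score and would prune segmentations before linking in order to stay within the response-time budget.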
