Dynamic Collective Entity Representations for Entity Ranking

Entity ranking, i.e., successfully positioning a relevant entity at the top of the ranking for a given query, is inherently difficult because of the potential mismatch between an entity's description in a knowledge base and the way people refer to that entity when searching for it. To counter this issue, we propose a method for constructing dynamic collective entity representations. We collect entity descriptions from a variety of sources and combine them into a single entity representation by learning to weight the content from the different sources associated with an entity for optimal retrieval effectiveness. Our method can add new descriptions in real time and learn the best representation as time evolves, so as to capture the dynamics of how people search for entities. Incorporating dynamic description sources into dynamic collective entity representations improves retrieval effectiveness by 7% over a state-of-the-art learning-to-rank baseline. Periodic retraining of the ranker yields higher ranking effectiveness for dynamic collective entity representations.
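To make the idea of a collective entity representation more concrete, the following is a minimal sketch, not the method evaluated in the paper. It assumes that each description source is kept as a separate field of term counts, that a query is scored by a weighted sum of length-normalized term frequencies per field, and that the per-source weights are given by hand (the abstract describes learning them). The class name, function names, source labels, and weight values are all hypothetical.

```python
from collections import Counter, defaultdict


class EntityRepresentation:
    """Hypothetical fielded representation: one bag of terms per source."""

    def __init__(self, entity_id):
        self.entity_id = entity_id
        self.fields = defaultdict(Counter)  # source name -> term counts

    def add_description(self, source, text):
        # New descriptions can be appended at any time, so the
        # representation changes as new content arrives.
        self.fields[source].update(text.lower().split())


def score(entity, query, weights):
    """Weighted sum of normalized query-term frequencies over all fields.

    The weights are assumed to be supplied by the caller; in the paper
    they are learned, which this sketch does not attempt.
    """
    terms = query.lower().split()
    total = 0.0
    for source, counts in entity.fields.items():
        length = sum(counts.values())
        if length == 0:
            continue
        tf = sum(counts[t] for t in terms) / length
        total += weights.get(source, 0.0) * tf
    return total


def rank(entities, query, weights):
    # Highest-scoring entity first; ties keep their input order.
    return sorted(entities, key=lambda e: score(e, query, weights), reverse=True)
```

In this sketch, an entity whose knowledge base text never mentions a colloquial name starts to match that name once a description containing it is added from another source, which is the vocabulary mismatch the abstract refers to.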
