Mining Entity Types from Query Logs via User Intent Modeling

We predict entity type distributions in Web search queries via probabilistic inference in graphical models that capture how entity-bearing queries are generated. We jointly model the interplay between latent user intents that govern queries and unobserved entity types, leveraging observed signals from query formulations and document clicks. We apply the models to resolve entity types in new queries and to assign prior type distributions over an existing knowledge base. Our models are efficiently trained using maximum likelihood estimation over millions of real-world Web search queries. We show that modeling user intent significantly improves entity type resolution for head queries over the state of the art, on several metrics, without degradation in tail query performance.
