Understanding the Semantic Structure of Noun Phrase Queries

Determining the semantic intent of web queries involves not only identifying their semantic class, which has been the primary focus of previous work, but also understanding their semantic structure. In this work, we formally define the semantic structure of noun phrase queries as composed of intent heads and intent modifiers. We present methods, based on Markov and semi-Markov conditional random fields, that automatically identify these constituents as well as their semantic roles. We show that the use of semantic and syntactic features contributes significantly to improving query understanding performance.
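
To make the notion of intent heads and intent modifiers concrete, the following is a minimal sketch of how such a structure might be represented. The query, the role names (Title, Year, Cast), the `Segment` class, and the `to_token_labels` helper are all hypothetical illustrations, not taken from the abstract. The segment list corresponds loosely to the segment-level view a semi-Markov model works with, and the B/I token labels to the token-level view of a Markov model.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Segment:
    """One contiguous span of a query (hypothetical representation)."""
    tokens: List[str]
    kind: str   # "IH" for intent head, "IM" for intent modifier
    role: str   # assumed semantic role name, e.g. "Title"


def to_token_labels(segments: List[Segment]) -> List[Tuple[str, str]]:
    """Flatten segment-level annotations into per-token B/I labels."""
    labels = []
    for seg in segments:
        for i, tok in enumerate(seg.tokens):
            prefix = "B" if i == 0 else "I"  # B starts a segment, I continues it
            labels.append((tok, f"{prefix}-{seg.kind}:{seg.role}"))
    return labels


# Illustrative noun phrase query: "alice in wonderland 2010 cast"
query = [
    Segment(["alice", "in", "wonderland"], "IM", "Title"),
    Segment(["2010"], "IM", "Year"),
    Segment(["cast"], "IH", "Cast"),
]

for token, label in to_token_labels(query):
    print(f"{token}\t{label}")
```

In this sketch the intent head ("cast") names what the user is asking for, while the intent modifiers narrow it down. An actual model would predict these segments and labels from features rather than take them as given.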
