Semantic Inversion in XML Keyword Search with General Conditional Random Fields

Keyword search is widely used in information retrieval systems such as search engines. However, input keywords are often so ambiguous that the user's retrieval intent cannot be determined explicitly. Inverting keywords into their intended semantics is therefore a meaningful problem. In this paper, we define the Semantic Inversion problem in XML keyword search and solve it with General Conditional Random Fields. Our algorithm takes different categories of relevance into account and produces alternative label sequences for the retrieval keywords. Experimental results show that our algorithm is effective, achieving 12% higher precision than the baseline.
