Paper Information - Syntactic Information Retrieval

Syntactic Information Retrieval

Natural language processing (NLP) techniques are believed to have the potential to improve retrieval accuracy in information retrieval (IR). In this paper we report a proof-of-concept study of a new approach to NLP-based IR. Documents and queries are represented as syntactic parse trees generated by a natural language parser. A document is then matched against a query on these tree representations, with tree comparison as the key operation. An IR experiment is designed to test whether the approach is feasible. The experimental results show that the approach is promising and has the potential to outperform the standard bag-of-words approach to information retrieval, especially for long queries.
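To make the idea of matching on parse trees concrete, here is a minimal Python sketch. It is not the authors' method: the abstract does not specify the tree comparison measure, so the sketch swaps in a simple stand-in (multiset Jaccard overlap of grammar productions). The tuple-based tree encoding, the function names `productions`, `tree_similarity`, and `rank`, and the hand-built example trees are all assumptions made for illustration; a real system would obtain its trees from a natural language parser.

```python
from collections import Counter

# Assumed encoding: a parse tree is a nested tuple (label, child, child, ...);
# a leaf (a word) is a plain string.

def productions(tree):
    """Yield one (parent_label, child_labels) pair per internal node."""
    if isinstance(tree, str):
        return
    label, *children = tree
    yield (label, tuple(c if isinstance(c, str) else c[0] for c in children))
    for child in children:
        yield from productions(child)

def tree_similarity(a, b):
    """Stand-in comparison: multiset Jaccard overlap of productions, in [0, 1]."""
    pa, pb = Counter(productions(a)), Counter(productions(b))
    union = sum((pa | pb).values())
    return sum((pa & pb).values()) / union if union else 1.0

def rank(query, documents):
    """Return document indices sorted by decreasing similarity to the query."""
    scores = [tree_similarity(query, d) for d in documents]
    return sorted(range(len(documents)), key=lambda i: -scores[i])

# Hypothetical hand-built trees standing in for parser output.
query = ("NP", ("JJ", "syntactic"), ("NN", "retrieval"))
docs = [
    ("S", ("NP", ("NN", "parsers")), ("VP", ("VBP", "fail"))),
    ("S", ("NP", ("JJ", "syntactic"), ("NN", "retrieval")), ("VP", ("VBZ", "works"))),
]
print(rank(query, docs))  # [1, 0]
```

The second document ranks first because it contains the query's noun phrase as a subtree, so all three of the query's productions are shared. A tree edit distance, as surveyed in reference [2], would be another candidate for the comparison step.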

Shengli Wu | Jun Liu | Sally I. McClean | Chang Liu | Hui Wang

[1] Avi Arampatzis, et al. Phrase-based information retrieval, 1998. (A previous version of this work was presented as a paper at RIAO.)

[2] Philip Bille, et al. A survey on tree edit distance and related problems, 2005, Theor. Comput. Sci.

[3] Philip N. Klein, et al. Computing the Edit-Distance between Unrooted Ordered Trees, 1998, ESA.