Exploration of term relationship for Bayesian network based sentence retrieval

Sentence retrieval aims to retrieve query-relevant sentences in response to a user query. However, the limited information contained in a sentence introduces considerable uncertainty, which heavily degrades retrieval performance. To address this problem, the Bayesian network, which is widely accepted as one of the most promising methodologies for dealing with information uncertainty, is explored. Three sentence retrieval models based on Bayesian networks are proposed: the BNSR model, the BNSR_TR model and the BNSR_CR model. The BNSR model assumes independence between terms and shows some improvement in retrieval performance. The BNSR_TR and BNSR_CR models relax the term-independence assumption but consider term relationships from two different points of view, namely the term and the term context. Experiments verify the performance improvements produced by these two models, with BNSR_CR showing a greater advantage than BNSR_TR because it identifies term dependency more accurately.
