Paper information - Relation Based Term Weighting Regularization

Relation Based Term Weighting Regularization

Traditional retrieval models compute term weights using only information about individual terms, such as TF and IDF. However, query terms are related, and intuitively these relations could provide useful information about the importance of a term in the context of the other query terms. For example, the query "perl tutorial" specifies that the user is looking for information relevant to both "perl" and "tutorial". Thus, a document containing both terms should receive a higher relevance score than documents containing only one of them. However, if the IDF value of "tutorial" is much smaller than that of "perl", existing retrieval models may assign such a document a lower score than documents containing multiple occurrences of "perl". The importance of a term should therefore depend not only on collection statistics but also on its relations with the other query terms. In this work, we study how to use semantic relations among query terms to regularize term weighting. Experimental results on TREC collections show that the proposed strategy effectively improves retrieval performance.
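The "perl tutorial" problem described in the abstract can be reproduced with a few lines of arithmetic. The sketch below is only an illustration under stated assumptions: the collection size, document frequencies, and term counts are invented toy values, and the scoring function is a deliberately simplified raw TF × IDF sum rather than any specific model evaluated in the paper. It does not implement the proposed regularization, which the abstract does not specify.

```python
import math

# Toy collection statistics (assumed values, not taken from the paper).
NUM_DOCS = 1000
DOC_FREQ = {"perl": 10, "tutorial": 500}  # "tutorial" is far more common


def idf(term):
    """Plain logarithmic IDF: log(N / df)."""
    return math.log(NUM_DOCS / DOC_FREQ[term])


def tf_idf_score(query_terms, doc_term_counts):
    """Sum of raw TF * IDF over the query terms (a simplified scorer)."""
    return sum(doc_term_counts.get(t, 0) * idf(t) for t in query_terms)


query = ["perl", "tutorial"]

# Hypothetical documents: one matches both terms, one repeats "perl" only.
doc_both = {"perl": 1, "tutorial": 1}
doc_perl_only = {"perl": 3}

score_both = tf_idf_score(query, doc_both)
score_perl_only = tf_idf_score(query, doc_perl_only)

# With these toy numbers the document that covers only one query term
# outscores the document that covers both.
print(f"both terms: {score_both:.3f}")
print(f"perl only:  {score_perl_only:.3f}")
```

Because each term is weighted independently, the low IDF of "tutorial" contributes little, and repeated matches on "perl" dominate the ranking. This is the behavior that weighting terms in the context of the other query terms is meant to correct.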

Hao Wu | Hui Fang
