IMPROVING THE RELEVANCY OF DOCUMENT SEARCH USING THE MULTI-TERM ADJACENCY KEYWORD-ORDER MODEL

This paper presents an enhanced vector space model, the Multi-Term Adjacency Keyword-Order Model, for improving the relevancy of search results, specifically in document search. The model is based on the concept of keyword grouping: the order of keywords within adjacent terms is taken into account when measuring a term’s weight. Assigning more weight to adjacent terms that appear in query order moves the document vector closer to the query vector, which increases the similarity between the two vectors and leads to more relevant documents being retrieved. We measure the performance of our model using precision metrics and compare it against a classic vector space model and a Multi-Term Vector Space Model. The results show that our model retrieves more relevant results for a given search query than both of the other models.
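The weighting idea described above can be illustrated with a minimal sketch. It is not the model's actual weighting scheme, which the abstract does not specify. It assumes plain term-frequency vectors, cosine similarity, and a hypothetical additive `boost` applied to each term of a document word pair that matches an adjacent, same-order word pair in the query. All function names and the `boost` parameter are illustrative assumptions.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; no stemming or stop words)."""
    return text.lower().split()


def ordered_pairs(tokens):
    """Adjacent token pairs, keeping their left-to-right order."""
    return list(zip(tokens, tokens[1:]))


def weighted_vector(doc_tokens, query_tokens, boost=1.0):
    """Term-frequency vector for a document.

    Each term of a document pair that also occurs as an adjacent,
    same-order pair in the query receives an extra `boost`
    (a hypothetical additive bonus, not taken from the paper).
    """
    weights = {term: float(count) for term, count in Counter(doc_tokens).items()}
    query_pairs = set(ordered_pairs(query_tokens))
    for pair in ordered_pairs(doc_tokens):
        if pair in query_pairs:
            for term in pair:
                weights[term] += boost
    return weights


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def score(query, document, boost=1.0):
    """Similarity between a query and a document under the sketch above."""
    query_tokens = tokenize(query)
    doc_tokens = tokenize(document)
    query_vector = {t: float(c) for t, c in Counter(query_tokens).items()}
    return cosine(query_vector, weighted_vector(doc_tokens, query_tokens, boost))


if __name__ == "__main__":
    query = "vector space model"
    in_order = "vector space model ranks documents well"
    scattered = "model ranks vector documents space well"
    print(score(query, in_order), score(query, scattered))
```

Both example documents contain exactly the same words, so with `boost=0` they receive the same score. With a positive boost, only the document in which the query terms appear adjacent and in query order is pulled toward the query vector, so it ranks higher.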
