Paper Information - N-layer Approach to Web Information Retrieval

N-layer Approach to Web Information Retrieval

In web information retrieval, terms (keywords) are used to index documents. These terms may appear in special locations such as the title, subtitles, headers, and hyperlinks. When calculating the weights of index terms, the vector space model ignores the importance of a term's position. The effectiveness of the vector space model depends crucially on the weights applied to the terms of the document vectors. These weights are computed with a term-weighting scheme based on the frequency of the terms in the document and in the collection: terms that occur more often in a document are treated as more important, whereas terms that occur less frequently across the collection are given a higher weight. The N-layer vector space approach takes the position of terms into account. The web document is logically divided into N layers according to its structure, and weights are assigned to terms based on the layers in which they appear within the document. Different weight evaluation schemes proposed for the vector space model are applied to the N-layer vector space model and compared. The N-layer vector space model gives better results than the vector space model: with cosine similarity and all six weight evaluation methods, formed from different local and global weights, its average precision and average recall are always better than those of the vector space model.
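The idea of position-dependent weighting can be illustrated with a minimal Python sketch. It is not the authors' implementation: the layer names, the values in `LAYER_WEIGHTS`, the function names, and the use of a plain log-IDF global weight are all assumptions made for illustration, since the abstract does not specify the layers, the six weighting schemes, or any numeric values.

```python
import math
from collections import Counter

# Hypothetical layer weights; the abstract does not state the actual values.
LAYER_WEIGHTS = {"title": 3.0, "header": 2.0, "body": 1.0}


def layered_tf(doc):
    """Local weight: sum the layer weight of every occurrence of a term.

    `doc` maps a layer name to the list of tokens found in that layer.
    """
    tf = Counter()
    for layer, tokens in doc.items():
        for token in tokens:
            tf[token] += LAYER_WEIGHTS[layer]
    return tf


def idf_table(docs):
    """Global weight: log(N / df), one common IDF variant (an assumption here)."""
    df = Counter()
    for doc in docs:
        df.update({t for tokens in doc.values() for t in tokens})
    return {t: math.log(len(docs) / df[t]) for t in df}


def weight_vector(doc, idf):
    """Combine local and global weights into a sparse term vector."""
    return {t: f * idf.get(t, 0.0) for t, f in layered_tf(doc).items()}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy collection: doc_a has "retrieval" in the title, doc_b only in the body.
doc_a = {"title": ["retrieval"], "body": ["web", "page"]}
doc_b = {"title": ["web"], "body": ["retrieval", "page"]}
doc_c = {"title": ["cooking"], "body": ["page", "recipe"]}
docs = [doc_a, doc_b, doc_c]

idf = idf_table(docs)
query = weight_vector({"body": ["retrieval"]}, idf)
score_a = cosine(query, weight_vector(doc_a, idf))
score_b = cosine(query, weight_vector(doc_b, idf))
```

In this toy collection a flat term-frequency model would score `doc_a` and `doc_b` identically for the query "retrieval", because each contains the term once. With the assumed layer weights, `doc_a` scores higher because the term appears in its title.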

H. B. Kekre | S. S. Sane | Jayant Gadge
