Ranking of web documents using semantic similarity

In recent years, semantic search for relevant documents on the web has been an important research topic. Many semantic web search engines, such as Ontolook and Swoogle, have been developed to help users find meaningful documents on the semantic web. The concept of semantic similarity has been widely used in fields such as artificial intelligence, cognitive science, natural language processing, and psychology. Semantic similarity is used to relate entities, texts, or documents that have the same meaning, typically by matching keywords extracted from the documents through syntactic parsing. However, the simple lexical matching usually used by semantic search engines does not retrieve web documents that meet user expectations. In this paper we propose a ranking scheme for semantic web documents based on the semantic similarity between each document and the query specified by the user. The proposed approach relies not only on the syntactic structure of the document but also on the semantic structure of both the document and the query, so it includes lexical as well as conceptual matching. The combined use of conceptual, linguistic, and ontology-based matching significantly improves the performance of the proposed ranking scheme. We explore all relevant relations between the query keywords to capture the user's intention, and then calculate the fraction of these relations present on each web page to determine the page's relevance to the query. We have found that this semantic-similarity-based ranking scheme gives better results than the prevailing methods.
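
The relation-fraction idea described above can be illustrated with a minimal sketch. It is not the paper's implementation: the representation of a relation as a (subject, predicate, object) triple, the function names `relation_fraction` and `rank_pages`, and the sample data are all assumptions made for illustration, and the step that extracts relations from a query or a page is assumed to have been done already.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical representation: one relation is a (subject, predicate, object) triple.
Relation = Tuple[str, str, str]


def relation_fraction(query_relations: Set[Relation],
                      page_relations: Set[Relation]) -> float:
    """Fraction of the query's relations that also occur on the page."""
    if not query_relations:
        return 0.0  # no relations to match, so no evidence of relevance
    matched = query_relations & page_relations
    return len(matched) / len(query_relations)


def rank_pages(query_relations: Set[Relation],
               pages: Dict[str, Set[Relation]]) -> List[Tuple[str, float]]:
    """Return (page id, score) pairs, highest score first; ties break by page id."""
    scored = [(page_id, relation_fraction(query_relations, relations))
              for page_id, relations in pages.items()]
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Illustrative data only; these triples are invented for the example.
query = {
    ("professor", "teaches", "course"),
    ("professor", "worksAt", "university"),
}
pages = {
    "page_a": {("professor", "teaches", "course"),
               ("professor", "worksAt", "university"),
               ("course", "partOf", "curriculum")},
    "page_b": {("professor", "teaches", "course")},
    "page_c": {("student", "attends", "course")},
}

if __name__ == "__main__":
    for page_id, score in rank_pages(query, pages):
        print(page_id, score)
```

In this sketch a page that contains every query relation scores 1.0 and a page that contains none scores 0.0; relations on the page that are unrelated to the query do not affect the score.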
