Semantic Searching and Ranking of Documents using Hybrid Learning System and WordNet

Semantic search seeks to improve a search engine's accuracy by understanding the searcher's intent and the contextual meaning of the query terms, so that more relevant results are retrieved. WordNet is used as the underlying reference database for computing the semantic similarity between query terms. Various Learning to Rank approaches are compared, and a new hybrid learning system is introduced that combines a Neural Network with a Support Vector Machine. Because the size of the training set strongly affects the performance of the Neural Network, we use the Support Vector Machine to reduce the data set by extracting the support vectors, which are the examples critical for learning. The reduced data set of support vectors is then used to learn a ranking function with the Neural Network. The proposed system is compared with RankNet, and the experimental results show promising performance improvements. The experiments use Gyannidhi, an English-Hindi parallel corpus from CDAC. F-measure and Average Interpolated Precision are used for evaluation.
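The two evaluation measures named above can be illustrated with a minimal sketch. It assumes the standard 11-point interpolated average precision (recall levels 0.0, 0.1, ..., 1.0) and the usual F-measure; the abstract does not state which variant was used, so this is an illustration under those assumptions, not the authors' evaluation code. All function names and the example ranking are hypothetical.

```python
def precision_recall_points(ranked_relevance, total_relevant):
    """Return (recall, precision) measured after each rank position."""
    if total_relevant <= 0:
        raise ValueError("total_relevant must be positive")
    points = []
    hits = 0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
        points.append((hits / total_relevant, hits / rank))
    return points


def interpolated_precision(points, recall_level):
    """Highest precision at any recall >= recall_level (0.0 if none)."""
    # Small tolerance guards against floating-point noise in the levels.
    candidates = [p for r, p in points if r >= recall_level - 1e-12]
    return max(candidates, default=0.0)


def average_interpolated_precision(ranked_relevance, total_relevant):
    """Mean interpolated precision over the 11 standard recall levels."""
    points = precision_recall_points(ranked_relevance, total_relevant)
    levels = [i / 10 for i in range(11)]
    total = sum(interpolated_precision(points, lvl) for lvl in levels)
    return total / len(levels)


def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (F1 when beta=1)."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical ranking: relevant documents at ranks 1 and 3, and the
# collection contains exactly 2 relevant documents for this query.
ranking = [True, False, True, False, False]
aip = average_interpolated_precision(ranking, total_relevant=2)
f1 = f_measure(precision=2 / 5, recall=2 / 2)
print(f"average interpolated precision = {aip:.4f}")
print(f"F1 at cutoff 5 = {f1:.4f}")
```

In this example the interpolated precision is 1.0 for recall levels up to 0.5 and 2/3 for the higher levels, so a ranking that places relevant documents earlier scores higher on both measures.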
