Kernel-based similarity and discovering documents of similar interests

A continuing problem in information retrieval is finding documents with similar features. A number of methods address this problem using latent topic analysis or its refinements. In all of them, the similarity measure is crucial to obtaining feasible solutions. This paper proposes a method that uses a diffusion kernel on a term network to define a similarity measure and to search a given corpus for documents that share specified features. Along the way, some properties of kernel-based similarity are identified in comparison with other measures, in particular those based on an adaptive latent topic analysis model named hk-LSA. Numerical experiments and statistical comparisons are used to demonstrate the results of the proposed method.
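
The abstract does not say how the term network or the kernel is constructed, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes an unweighted term co-occurrence graph, the standard graph diffusion kernel K = exp(-βL) with L the graph Laplacian, and a normalized bilinear form xᵀKy as the document similarity. The function names, the toy corpus, the value of β, and the truncated Taylor series used to evaluate the matrix exponential are all illustrative assumptions.

```python
from math import sqrt


def cooccurrence_graph(docs, vocab):
    """Adjacency matrix: two terms are linked if they share a document (assumed construction)."""
    index = {t: i for i, t in enumerate(vocab)}
    n = len(vocab)
    A = [[0.0] * n for _ in range(n)]
    for doc in docs:
        present = {index[t] for t in doc if t in index}
        for i in present:
            for j in present:
                if i != j:
                    A[i][j] = 1.0
    return A


def matmul(X, Y):
    """Plain square-matrix product, adequate for a toy vocabulary."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def diffusion_kernel(A, beta=0.5, terms=30):
    """K = exp(-beta * L) with L = D - A, via a truncated Taylor series."""
    n = len(A)
    # M = -beta * L, where L has the degrees on the diagonal and -A elsewhere.
    M = [[-beta * ((sum(A[i]) if i == j else 0.0) - A[i][j]) for j in range(n)]
         for i in range(n)]
    K = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in K]
    for k in range(1, terms + 1):
        # After this step, term holds M^k / k!.
        term = [[v / k for v in row] for row in matmul(term, M)]
        K = [[K[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return K


def term_vector(doc, vocab):
    """Raw term counts of a document over the vocabulary."""
    return [float(doc.count(t)) for t in vocab]


def kernel_similarity(x, y, K):
    """Normalized kernel similarity x^T K y / sqrt((x^T K x)(y^T K y))."""
    def form(u, v):
        n = len(u)
        return sum(u[i] * K[i][j] * v[j] for i in range(n) for j in range(n))
    return form(x, y) / sqrt(form(x, x) * form(y, y))


# Toy corpus (hypothetical): "kernel" and "diffusion" never co-occur,
# but both are linked to "graph" in the term network.
vocab = ["kernel", "graph", "diffusion", "topic", "latent"]
corpus = [["kernel", "graph"], ["graph", "diffusion"], ["topic", "latent"]]
K = diffusion_kernel(cooccurrence_graph(corpus, vocab), beta=0.5)

query = term_vector(["kernel"], vocab)
for doc in (["diffusion"], ["topic"]):
    print(doc, kernel_similarity(query, term_vector(doc, vocab), K))
```

In this sketch, two documents with no terms in common still receive a positive similarity when their terms are connected through the network, whereas plain cosine similarity on term counts would give zero. Terms in disconnected components of the graph contribute nothing. For a realistic vocabulary, a library matrix exponential or an eigendecomposition of L would replace the Taylor series.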
