Feasible settings for the adaptive latent semantic analysis: Hk-LSA model

Several improvements to latent semantic analysis (LSA), which uses singular value decomposition to derive latent semantic classes, have recently been proposed, notably the hk-LSA model. The hk-LSA model is based on reducing the dimension of the vector space and on a probability-like relationship between the document-term space and the latent-topic space. This improved model overcomes some shortcomings of standard LSA, such as having to process very dense, orthogonal matrices and being difficult to parallelize. This paper presents some feasible ways to set up such a model, together with statistical comparisons among them, to identify good setups for the hk-LSA model. Case studies on this subject suggest some ways to set up hk-LSA and show relationships between the standard LSA and hk-LSA models.
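For orientation, the standard LSA baseline mentioned above can be sketched as a truncated singular value decomposition of a term-document matrix. This is only a minimal illustration of standard LSA, not of hk-LSA, whose construction is not given here; the function name `lsa_embed`, the helper `cosine`, and the toy matrix `X` are assumptions made for the example.

```python
import numpy as np

def lsa_embed(term_doc, k):
    """Rank-k standard LSA via SVD (illustrative helper, not hk-LSA).

    term_doc: array of shape (n_terms, n_docs).
    Returns (term_vectors, singular_values, doc_vectors).
    """
    # Thin SVD: term_doc = U @ diag(s) @ Vt, singular values in descending order
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    U_k, s_k, Vt_k = U[:, :k], s[:k], Vt[:k, :]
    term_vectors = U_k * s_k      # one k-dimensional row per term
    doc_vectors = Vt_k.T * s_k    # one k-dimensional row per document
    return term_vectors, s_k, doc_vectors

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical toy counts: rows are terms, columns are documents.
# Documents 0 and 1 share terms 0 and 1; documents 2 and 3 share terms 2 and 3.
X = np.array([
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 1, 2],
    [0, 0, 2, 1],
], dtype=float)

terms, sigma, docs = lsa_embed(X, k=2)
same_topic = cosine(docs[0], docs[1])    # documents sharing vocabulary
cross_topic = cosine(docs[0], docs[2])   # documents with disjoint vocabulary
```

In this toy case the two retained dimensions separate the two groups of documents: documents that share vocabulary end up close in the latent space, while documents with disjoint vocabulary do not. The dense orthogonal factors `U` and `Vt` are the kind of matrices the abstract names as a shortcoming of standard LSA.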
