Weight functions impact on LSA performance

This paper presents experimental results on the use of latent semantic analysis (LSA) for analysing English literary texts. Several preliminary transformations of the text-document frequency matrix, each using a different weight function, are tested against control subsets. Additional clustering based on the correlation matrix is applied to reveal the latent structure. The algorithm builds a shaded-form matrix from the singular values and vectors. The results are interpreted as a measure of the quality of the transformations and are compared with the control-set tests.
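As a rough illustration of the pipeline outlined in the abstract, the sketch below weights a small frequency matrix, truncates its singular value decomposition, and computes a document-by-document correlation matrix. Several points are assumptions rather than details from the paper: log-entropy is used as one common example of a weight function, the correlations are taken between columns of the rank-k reconstruction, and the names `log_entropy_weight`, `lsa_document_correlations`, and the toy `counts` matrix are hypothetical.

```python
import numpy as np


def log_entropy_weight(counts):
    """Log-entropy weighting of a term-by-document count matrix.

    Assumes at least two documents (otherwise log(n_docs) is zero).
    """
    counts = np.asarray(counts, dtype=float)
    n_docs = counts.shape[1]
    # Local weight: dampen raw frequencies with log(1 + tf).
    local = np.log1p(counts)
    # Global weight: 1 + sum_j p_ij * log(p_ij) / log(n_docs),
    # where p_ij is the share of term i's occurrences found in document j.
    row_totals = counts.sum(axis=1, keepdims=True)
    p = np.divide(counts, row_totals,
                  out=np.zeros_like(counts), where=row_totals > 0)
    plogp = np.zeros_like(p)
    mask = p > 0
    plogp[mask] = p[mask] * np.log(p[mask])
    global_w = 1.0 + plogp.sum(axis=1) / np.log(n_docs)
    return local * global_w[:, None]


def lsa_document_correlations(weighted, k):
    """Correlate documents using the rank-k SVD reconstruction."""
    u, s, vt = np.linalg.svd(weighted, full_matrices=False)
    # Keep only the k largest singular values and their vectors.
    reduced = (u[:, :k] * s[:k]) @ vt[:k]
    # Columns are documents, so correlate column against column.
    return np.corrcoef(reduced, rowvar=False)


# Hypothetical toy data: 5 terms (rows) by 4 documents (columns).
counts = np.array([
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 3, 1],
    [0, 0, 1, 2],
    [1, 1, 1, 1],  # evenly spread term: its global weight becomes zero
])
weighted = log_entropy_weight(counts)
corr = lsa_document_correlations(weighted, k=2)
```

Correlating the reconstructed columns, rather than the raw coordinates in the reduced space, keeps the result independent of the arbitrary sign of each singular vector. Swapping in a different weight function only requires replacing `log_entropy_weight`, which is how alternative transformations could be compared on the same control subsets.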
