Latent semantic analysis.

This article reviews latent semantic analysis (LSA), which is both a theory of meaning and a method for extracting that meaning from passages of text through statistical computations over a collection of documents. As a theory of meaning, LSA defines a latent semantic space in which documents and individual words are represented as vectors. As a computational technique, LSA uses linear algebra to extract the dimensions that represent that space. This representation supports computing similarity among terms and documents, categorizing terms and documents, and summarizing large collections of documents with automated procedures that mimic the way humans perform similar cognitive tasks. We present some technical details and various illustrative examples, and we discuss a number of applications from linguistics, psychology, cognitive science, education, information science, and the analysis of textual data in general. WIREs Cogn Sci 2013, 4:683-692. doi: 10.1002/wcs.1254 CONFLICT OF INTEREST: The author has declared no conflicts of interest for this article.
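The abstract's description of the computation (vectors for terms and documents in a reduced space, then similarity between them) can be sketched in a few lines. This is only an illustrative sketch, not the article's own procedure: the toy term-document matrix, the choice of k = 2, the use of raw counts without any term weighting, and the helper names `lsa` and `cosine` are all assumptions made for the example. The linear algebra step is taken here to be a truncated singular value decomposition, the usual choice for LSA.

```python
import numpy as np

# Hypothetical toy term-document count matrix (rows: terms, columns: documents).
# "car" and "automobile" never occur in the same document, but both occur
# alongside "engine".
terms = ["car", "automobile", "engine", "flower", "petal"]
X = np.array([
    [1, 0, 0, 0],  # car
    [0, 1, 0, 0],  # automobile
    [1, 1, 0, 0],  # engine
    [0, 0, 1, 1],  # flower
    [0, 0, 1, 0],  # petal
], dtype=float)


def lsa(X, k):
    """Return k-dimensional term and document vectors from a truncated SVD."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    term_vecs = U[:, :k] * s[:k]    # one row per term
    doc_vecs = Vt[:k, :].T * s[:k]  # one row per document
    return term_vecs, doc_vecs


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


term_vecs, doc_vecs = lsa(X, k=2)

# Similarity in the original space versus the reduced (latent) space.
raw_sim = cosine(X[0], X[1])
latent_sim = cosine(term_vecs[0], term_vecs[1])
unrelated_sim = cosine(term_vecs[0], term_vecs[3])
print("car vs automobile, raw counts:  ", round(raw_sim, 3))
print("car vs automobile, latent space:", round(latent_sim, 3))
print("car vs flower, latent space:    ", round(unrelated_sim, 3))
```

In this toy setup, "car" and "automobile" have zero similarity on raw counts because they share no document, yet they come out close to 1 in the two-dimensional space because both co-occur with "engine". Real applications would typically weight the counts first and keep far more dimensions.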
