Gallito 2.0: a Natural Language Processing tool to support Research on Discourse

Gallito 2.0 is a tool designed for both building and experimenting with vector space models based on Latent Semantic Analysis (LSA). A free, time-limited version is available (http://www.elsemantico.es/gallito20/index-eng.html). The tool supports the creation and evaluation of semantic spaces generated from medium-scale to very large corpora (note that Gallito 2.0 uses sparse matrices), as well as several related tasks, such as extraction of term entropy indices, familiarity measured through vector length, similarity between terms, lists of semantic neighbors, K-means clustering, interpretation of topics, change of basis, Gram-Schmidt re-orthogonalization, Construction-Integration representations, textual coherence, and essay evaluation. The present poster shows some uses of the tool with a small-scale corpus taken from several newspapers.
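To make a few of the listed measures concrete, the following minimal Python sketch builds a tiny LSA space and computes term entropy, vector length, term similarity, and semantic neighbors. It is not Gallito's code or interface: the toy corpus, the function names (`build_lsa`, `cosine`, `neighbors`), the log-entropy weighting, and the dense NumPy matrices are all assumptions made for illustration. A tool working on large corpora would use sparse matrices instead.

```python
import numpy as np

# Hypothetical toy corpus standing in for a newspaper sample.
DOCS = [
    "the government approved the new budget",
    "the parliament debated the budget and raised taxes",
    "the team won the football match after extra time",
    "the striker scored in the football final",
]

def build_lsa(docs, k=2):
    """Return vocabulary, term index, term entropies and k-dim term vectors."""
    vocab = sorted({w for d in docs for w in d.split()})
    index = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.split():
            counts[index[w], j] += 1

    # Normalized term entropy in [0, 1]: 1 means the term is spread evenly
    # over all documents, 0 means it occurs in a single document.
    p = counts / counts.sum(axis=1, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        plogp = np.where(p > 0, p * np.log(p), 0.0)
    entropy = -plogp.sum(axis=1) / np.log(len(docs))

    # Log-entropy weighting (an assumed, commonly used LSA weighting).
    weighted = np.log1p(counts) * (1.0 - entropy)[:, None]

    # Truncated SVD: keep the k largest singular dimensions.
    U, s, _ = np.linalg.svd(weighted, full_matrices=False)
    vectors = U[:, :k] * s[:k]
    return vocab, index, entropy, vectors

def cosine(u, v, eps=1e-12):
    """Cosine similarity; returns 0.0 when either vector is (near) zero."""
    nu, nv = np.linalg.norm(u), np.linalg.norm(v)
    if nu < eps or nv < eps:
        return 0.0
    return float(u @ v / (nu * nv))

def neighbors(term, vocab, index, vectors, n=3):
    """The n terms most similar to `term` by cosine, excluding the term itself."""
    target = vectors[index[term]]
    scored = [(cosine(target, vectors[index[w]]), w) for w in vocab if w != term]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [w for _, w in scored[:n]]

vocab, index, entropy, vectors = build_lsa(DOCS, k=2)
# Vector length, used in the text as a familiarity measure.
lengths = np.linalg.norm(vectors, axis=1)
```

In this sketch, `entropy[index["the"]]` is close to 1 because the word is spread evenly over all four documents, so its weighted vector nearly vanishes. `cosine(vectors[index["budget"]], vectors[index["taxes"]])` is high, while the similarity between "budget" and "football" is near zero. With only two dimensions and four documents the space is very coarse, so the neighbor lists mostly separate the two topics rather than rank terms finely.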