Checking and bootstrapping lexical norms by means of word similarity indexes

In psychology, lexical norms for the semantic properties of words, such as concreteness and valence, are important research resources. Collecting such norms by asking judges to rate words is very time consuming, which strongly limits the number of words they cover. In this article, we present a technique for estimating lexical norms based on the latent semantic analysis of a corpus. The analyses conducted show the technique's effectiveness for several semantic dimensions. In addition to extending norms, this technique can be used to check human ratings by identifying words whose rating differs markedly from the corpus-based estimate.
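The abstract does not spell out the estimation procedure, so the following is only a minimal sketch of one plausible reading: a word's norm is estimated as the mean rating of its nearest rated neighbours in a semantic space, with similarity measured by cosine. The two-dimensional vectors stand in for a latent semantic analysis space and, like the ratings, are made-up illustrative values, not data from the article. The function names, the neighbourhood size `k`, and the `threshold` are assumptions introduced for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def estimate_norm(word, vectors, ratings, k=2):
    """Mean rating of the k rated words most similar to `word`.

    The word itself is excluded, so the same function gives a
    leave-one-out estimate for words that already have a rating.
    """
    target = vectors[word]
    scored = [
        (cosine(target, vectors[w]), w)
        for w in ratings
        if w != word and w in vectors
    ]
    if not scored:
        raise ValueError("no rated neighbours available")
    scored.sort(reverse=True)  # most similar first
    neighbours = scored[:k]
    return sum(ratings[w] for _, w in neighbours) / len(neighbours)


def flag_discrepancies(vectors, ratings, k=2, threshold=4.0):
    """Rated words whose rating differs from the estimate by more than threshold."""
    return [
        w for w in ratings
        if abs(estimate_norm(w, vectors, ratings, k) - ratings[w]) > threshold
    ]


# Toy semantic space (hypothetical values standing in for LSA vectors).
vectors = {
    "joy": [0.90, 0.10], "happy": [0.80, 0.20], "delight": [0.85, 0.15],
    "grief": [0.10, 0.90], "sad": [0.20, 0.80], "misery": [0.15, 0.85],
    "cheer": [0.88, 0.12],  # unrated word whose valence we want to estimate
}
# Toy valence ratings on a 1-9 scale (hypothetical).
ratings = {"joy": 8.0, "happy": 8.0, "delight": 8.5,
           "grief": 2.0, "sad": 2.0, "misery": 1.5}

if __name__ == "__main__":
    # Extending the norms: estimate a rating for an unrated word.
    print("cheer:", estimate_norm("cheer", vectors, ratings))
    # Checking the norms: simulate a miscoded rating and look for it.
    miscoded = dict(ratings, happy=2.0)
    print("suspect ratings:", flag_discrepancies(vectors, miscoded))
```

The same function serves both uses named in the abstract: applied to unrated words it extends the norms, and applied leave-one-out to rated words it yields an estimate that can be compared with the human rating. In practice `k` and `threshold` would need to be tuned on held-out ratings.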
