Using a high-dimensional graph of semantic space to model relationships among words

The GOLD model (Graph Of Language Distribution) is a network model constructed from co-occurrence in a large corpus of natural language. It may be used to explore what information is present in a graph-structured model of language, and what information can be extracted through theoretically driven algorithms as well as standard graph analysis methods. The present study employs GOLD to examine two types of relationship between words: semantic similarity and associative relatedness. Semantic similarity refers to the degree of overlap in meaning between words, while associative relatedness refers to the degree to which two words occur in the same schematic context. A graph-structured model of language built from co-occurrence is expected to capture associative relatedness easily, because this type of relationship is thought to be present directly in lexical co-occurrence. Semantic similarity, in contrast, is hypothesized to be recoverable from the intersection of the two words' sets of first-order connections: two words that are semantically similar may occupy similar thematic or syntactic roles across contexts and thus would co-occur lexically with the same set of nodes. Two versions of the GOLD model that differed in the co-occurrence window, bigGOLD at the paragraph level and smallGOLD at the adjacent-word level, were compared directly with a well-established distributional model, Latent Semantic Analysis (LSA). The superior performance of the GOLD models (big and small) suggests that a single acquisition and storage mechanism, namely co-occurrence, can account for both associative and conceptual relationships between words and is more psychologically plausible than models using singular value decomposition (SVD).
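The two measures described in the abstract can be illustrated with a minimal sketch. This is not the GOLD implementation: the function names, the toy contexts, the binary edge (no weighting), and the use of Jaccard normalization for the neighbor-set intersection are all assumptions made for illustration. Each inner list stands in for one co-occurrence window (a paragraph for bigGOLD, an adjacent-word pair for smallGOLD).

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_graph(contexts):
    """Map each word to the set of words it shares at least one context with."""
    graph = defaultdict(set)
    for context in contexts:
        # Every pair of distinct words in the same window gets an edge.
        for a, b in combinations(set(context), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def associative_relatedness(graph, w1, w2):
    """Assumed proxy: 1 if the words are first-order neighbors, else 0."""
    return 1 if w2 in graph.get(w1, set()) else 0


def neighbor_overlap(graph, w1, w2):
    """Assumed proxy for semantic similarity: Jaccard overlap of the two
    words' first-order neighbor sets, excluding the words themselves."""
    n1 = graph.get(w1, set()) - {w2}
    n2 = graph.get(w2, set()) - {w1}
    union = n1 | n2
    return len(n1 & n2) / len(union) if union else 0.0


# Toy corpus (hypothetical): each list is one co-occurrence window.
contexts = [
    ["doctor", "treats", "patient"],
    ["nurse", "treats", "patient"],
    ["doctor", "hospital"],
    ["nurse", "hospital"],
    ["bread", "butter"],
]
graph = build_cooccurrence_graph(contexts)

# "doctor" and "nurse" never co-occur but share all of their neighbors.
print(associative_relatedness(graph, "doctor", "nurse"))  # 0
print(neighbor_overlap(graph, "doctor", "nurse"))         # 1.0

# "bread" and "butter" co-occur directly but share no other neighbors.
print(associative_relatedness(graph, "bread", "butter"))  # 1
print(neighbor_overlap(graph, "bread", "butter"))         # 0.0
```

In this toy case the direct edge separates the associated pair, while the neighbor-set overlap separates the semantically similar pair, which is the dissociation the abstract hypothesizes. A real model would presumably weight edges by co-occurrence frequency rather than treat them as binary.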
