Using Statistical Models of Morphology in the Search for Optimal Units of Representation in the Human Mental Lexicon

Determining the optimal units for representing morphologically complex words in the mental lexicon is a central question in psycholinguistics. Here, we use advances in the computational sciences to study human morphological processing with statistical models of morphology, in particular the unsupervised Morfessor model, which works on the principle of optimization. The aim was to see which kind of model structure corresponds best to human word recognition costs for multimorphemic Finnish nouns: a model incorporating units that resemble linguistically defined morphemes, a whole-word model, or a model that seeks an optimal balance between these two extremes. Our results showed that human word recognition was predicted best by a combination of two models: a whole-word model and a model that decomposes words at some morpheme boundaries while leaving others unsegmented. The results support dual-route models, which assume that both decomposed and full-form representations are used to process complex words optimally within the mental lexicon.
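
The trade-off between whole-word and morpheme-like units can be illustrated with a toy two-part cost: the size of the unit lexicon plus the size of the corpus encoded with those units. The sketch below is not Morfessor's implementation or its actual cost function. The function names, the coding scheme, and the word counts are all assumptions invented for illustration.

```python
import math
from collections import Counter

# Invented toy counts for a few Finnish noun forms (illustration only).
WORD_COUNTS = {
    "talo": 10, "talossa": 5, "talosta": 3,
    "auto": 8, "autossa": 4, "autosta": 2,
}

# Three hypothetical segmentations; words not listed are kept whole.
WHOLE_WORD = {}
MORPHS = {
    "talossa": ["talo", "ssa"], "talosta": ["talo", "sta"],
    "autossa": ["auto", "ssa"], "autosta": ["auto", "sta"],
}
MIXED = {k: v for k, v in MORPHS.items() if k != "talossa"}


def lexicon_cost(units, bits_per_char):
    """Bits needed to spell out each unit once, plus an end marker per unit."""
    return sum((len(u) + 1) * bits_per_char for u in units)


def corpus_cost(unit_counts):
    """Bits needed to encode all unit tokens under a unigram model."""
    total = sum(unit_counts.values())
    return -sum(c * math.log2(c / total) for c in unit_counts.values())


def two_part_cost(word_counts, segmentation):
    """Return lexicon, corpus, and total cost (in bits) for a segmentation."""
    unit_counts = Counter()
    for word, count in word_counts.items():
        for unit in segmentation.get(word, [word]):
            unit_counts[unit] += count
    alphabet = {ch for word in word_counts for ch in word}
    bits_per_char = math.log2(len(alphabet) + 1)  # +1 for the end marker
    lex = lexicon_cost(unit_counts.keys(), bits_per_char)
    corp = corpus_cost(unit_counts)
    return {"lexicon": lex, "corpus": corp, "total": lex + corp}


if __name__ == "__main__":
    for name, seg in [("whole-word", WHOLE_WORD), ("morphs", MORPHS), ("mixed", MIXED)]:
        cost = two_part_cost(WORD_COUNTS, seg)
        print(f"{name:10s} lexicon={cost['lexicon']:.1f} "
              f"corpus={cost['corpus']:.1f} total={cost['total']:.1f}")
```

With these invented counts, splitting words into stems and suffixes yields a cheaper lexicon but a costlier corpus encoding, while the whole-word model shows the opposite pattern. A model that balances the two terms therefore segments some words and leaves others whole, which is the kind of intermediate structure the abstract describes.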
