A Probabilistic Model of Early Argument Structure Acquisition

Developing computational algorithms that capture the complex structure of natural language is an open problem. In particular, learning the abstract properties of language only from usage data remains a challenge. In this dissertation, we present a probabilistic usage-based model of verb argument structure acquisition that can successfully learn abstract knowledge of language from instances of verb usage, and use this knowledge in various language tasks. The model demonstrates the feasibility of a usage-based account of language learning, and provides a concrete explanation for the observed patterns in child language acquisition. We propose a novel representation for the general constructions of language as probabilistic associations between syntactic and semantic features of a verb usage; these associations generalize over the syntactic patterns and the fine-grained semantics of both the verb and its arguments. The probabilistic nature of argument structure constructions in the model enables it to capture both statistical effects in language learning and adaptability in language use. The acquisition of constructions is modeled as detecting similar usages and grouping them together. We use a probabilistic measure of similarity between verb usages, and a Bayesian framework for clustering them. Language use, on the other hand, is modeled as a prediction problem: each language task is viewed as finding the best value for a missing feature in a usage, based on the available features in that same usage and the knowledge of language acquired so far. In formulating prediction, we use the same Bayesian framework as for learning, a formulation which takes into account both the general knowledge of language (i.e., constructions) and the specific behaviour of each verb. We show through computational simulation that the behaviour of the model mirrors that of young children in some relevant aspects.
The model goes through the same learning stages as children do: the conservative use of the more frequent usages for each individual verb at the beginning, followed by a phase when general patterns are grasped and applied overtly, which leads to occasional overgeneralization errors. Such errors cease to be made over time as the model processes more input. We also investigate the learnability of verb semantic roles, a critical aspect of linking the syntax and semantics of verbs. In contrast to many existing linguistic theories and computational models, which assume that semantic roles are innate and fixed, we show that general conceptions of semantic roles can be learned from the semantic properties of the verb arguments in the input usages. We represent each role as a semantic profile for an argument position in a general construction, where a profile is a probability distribution over a set of semantic properties that verb arguments can take. We extend this view to model the learning and use of verb selectional preferences, a phenomenon usually viewed as separate from verb semantic roles. Our experimental results show that the model learns intuitive profiles for both semantic roles and selectional preferences. Moreover, the learned profiles are shown to be useful in various language tasks, as observed in reported experimental data on human subjects, such as resolving ambiguity in language comprehension and simulating human plausibility judgements.
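
The two mechanisms described above, grouping similar usages with a Bayesian clustering step and treating language use as prediction of a missing feature, can be illustrated with a minimal sketch. This is not the dissertation's implementation: the feature names (`verb`, `pattern`, `event`), the smoothing constant `LAMBDA`, the assumed value count `NUM_VALUES`, and the classes `Construction` and `Learner` are all hypothetical choices made for illustration. Each usage is assigned to the existing construction, or to a new one, with the highest prior times likelihood; prediction then sums over constructions, weighting each by how well it explains the observed features.

```python
from collections import Counter, defaultdict

LAMBDA = 0.1      # smoothing constant (assumed for this sketch)
NUM_VALUES = 10   # assumed number of possible values per feature


class Construction:
    """A cluster of verb usages, stored as per-feature value counts."""

    def __init__(self):
        self.size = 0
        self.counts = defaultdict(Counter)  # feature -> value -> count

    def add(self, usage):
        self.size += 1
        for feature, value in usage.items():
            self.counts[feature][value] += 1

    def prob(self, feature, value):
        # Smoothed estimate of P(feature = value | construction).
        seen = self.counts[feature][value]
        return (seen + LAMBDA) / (self.size + LAMBDA * NUM_VALUES)

    def likelihood(self, usage):
        # Features are treated as independent given the construction.
        result = 1.0
        for feature, value in usage.items():
            result *= self.prob(feature, value)
        return result


class Learner:
    def __init__(self):
        self.constructions = []
        self.total = 0  # number of usages processed so far

    def learn(self, usage):
        # Score of opening a new construction: prior 1/(N+1) times the
        # likelihood under an empty construction.
        best = None
        best_score = Construction().likelihood(usage) / (self.total + 1)
        for c in self.constructions:
            score = c.size / (self.total + 1) * c.likelihood(usage)
            if score > best_score:
                best, best_score = c, score
        if best is None:
            best = Construction()
            self.constructions.append(best)
        best.add(usage)
        self.total += 1
        return best

    def predict(self, partial, feature, candidates):
        # Pick the candidate value with the highest probability given the
        # observed features, marginalizing over learned constructions.
        def score(value):
            return sum(
                c.size / self.total * c.likelihood(partial) * c.prob(feature, value)
                for c in self.constructions
            )
        return max(candidates, key=score)


learner = Learner()
for usage in [
    {"verb": "eat", "pattern": "arg1 verb arg2", "event": "consume"},
    {"verb": "eat", "pattern": "arg1 verb arg2", "event": "consume"},
    {"verb": "fall", "pattern": "arg1 verb", "event": "move"},
    {"verb": "drink", "pattern": "arg1 verb arg2", "event": "consume"},
]:
    learner.learn(usage)

patterns = ["arg1 verb arg2", "arg1 verb"]
known = learner.predict({"verb": "drink", "event": "consume"}, "pattern", patterns)
novel = learner.predict({"verb": "gorp", "event": "move"}, "pattern", patterns)
print(len(learner.constructions), known, novel)
```

In this toy run the transitive usages of "eat" and "drink" end up in one construction and "fall" in another, so a novel verb paired with a "move" event is predicted to take the intransitive pattern. Under this sketch, a semantic profile for an argument position would simply be the smoothed distribution returned by `prob` for a feature that holds argument properties.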
