Memory-Bounded Left-Corner Unsupervised Grammar Induction on Child-Directed Input

This paper presents a new memory-bounded left-corner parsing model for unsupervised syntax induction from raw text, using unsupervised hierarchical hidden Markov models (UHHMMs). We use this model to investigate the extent to which human language learners can discover hierarchical syntax from distributional statistics alone. To that end, the model incorporates two widely accepted features of human language acquisition and sentence processing that no existing grammar induction algorithm has modeled simultaneously: (1) a left-corner parsing strategy and (2) limited working memory capacity. To approximate the input that human language learners actually receive, we evaluate the system on a corpus of child-directed speech rather than on the newswire corpora typically used. Its results beat or closely match those of three competing systems.
