Modeling Infant Word Segmentation

While many computational models have been created to explore how children might learn to segment words, the focus has largely been on achieving higher levels of performance and exploring cues suggested by artificial learning experiments. We propose a broader focus that includes designing models that display properties of infants' performance as they begin to segment words. We develop an efficient bootstrapping online learner with this focus in mind, and evaluate it on child-directed speech. In addition to attaining a high level of performance, this model predicts the error patterns seen in infants learning to segment words.
