Learning Phone Embeddings for Word Segmentation of Child-Directed Speech

This paper presents a novel model that learns and exploits embeddings of phone n-grams for word segmentation in child language acquisition. Embedding-based models are evaluated on a phonemically transcribed corpus of child-directed speech and compared with their symbolic counterparts, which use the same learning framework and features. Results show that learning embeddings significantly improves performance. We use extensive visualization to examine what the model has learned, and show that the learned embeddings are informative both for word segmentation and for phonology in general.
