Prosodic boundary information helps unsupervised word segmentation

It is well known that infants use prosodic information in early language acquisition. In particular, prosodic boundaries have been shown to help infants with sentence- and word-level segmentation. In this study, we extend an unsupervised method for word segmentation to include information about prosodic boundaries. The boundary information was either derived from oracle (hand-annotated) data or extracted automatically by a system that uses only acoustic cues for boundary detection. The approach was tested on two languages, English and Japanese, and the results show that boundary information helps word segmentation in both cases. The performance gain obtained for two typologically distinct languages shows the robustness of prosodic information for word segmentation. Furthermore, the improvements are not limited to oracle information: similar performance is obtained with automatically extracted boundaries.
