A joint model of word segmentation and phonological variation for English word-final /t/-deletion

Word-final /t/-deletion refers to a common phenomenon in spoken English where words such as /wEst/ “west” are pronounced as [wEs] “wes” in certain contexts. Phonological variation like this is common in naturally occurring speech. Current computational models of unsupervised word segmentation usually assume idealized input that is devoid of this kind of variation. We extend a non-parametric model of word segmentation by adding phonological rules that map underlying forms to surface forms, producing a mathematically well-defined joint model as a first step towards handling variation and segmentation in a single model. We analyse how our model handles /t/-deletion on a large corpus of transcribed speech, and show that the joint model can perform word segmentation and recover underlying /t/s. We find that bigram dependencies are important for performing well on real data and for learning appropriate deletion probabilities for different contexts.
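As a rough illustration of what a phonological rule mapping underlying forms to surface forms might look like, the sketch below implements a context-dependent word-final /t/-deletion rule in Python. It is not the model described above: the context classes (`consonant`, `vowel`, `pause`), the probability values in `DELETION_PROB`, the ASCII phoneme encoding, and all function names are assumptions made for this example only.

```python
# Hypothetical deletion probabilities keyed by the following context.
# The values are illustrative placeholders, not estimates from any corpus.
DELETION_PROB = {"consonant": 0.6, "vowel": 0.2, "pause": 0.3}

# Assumed ASCII phoneme symbols that count as vowels in this toy encoding.
VOWELS = set("aeiouAEIOU")


def following_context(next_word):
    """Classify the context to the right of a word (assumed three-way split)."""
    if next_word is None:
        return "pause"
    return "vowel" if next_word[0] in VOWELS else "consonant"


def surface_prob(surface, underlying, next_word):
    """P(surface | underlying, context) under the toy deletion rule."""
    if underlying.endswith("t") and len(underlying) > 1:
        p_del = DELETION_PROB[following_context(next_word)]
        if surface == underlying[:-1]:
            return p_del          # final /t/ deleted
        if surface == underlying:
            return 1.0 - p_del    # final /t/ retained
        return 0.0
    # Words without a final /t/ surface unchanged.
    return 1.0 if surface == underlying else 0.0


def surface_form(underlying, next_word, rng):
    """Sample a surface form; rng is any object with a random() method."""
    if underlying.endswith("t") and len(underlying) > 1:
        if rng.random() < DELETION_PROB[following_context(next_word)]:
            return underlying[:-1]
    return underlying


def realize(words, rng):
    """Generate an unsegmented surface string from underlying words."""
    out = []
    for i, word in enumerate(words):
        nxt = words[i + 1] if i + 1 < len(words) else None
        out.append(surface_form(word, nxt, rng))
    return "".join(out)
```

For instance, `surface_prob("wEs", "wEst", "koUst")` scores the deleted form of "west" before a consonant-initial word. A joint model would combine such a term with a segmentation prior over underlying words, so that a learner seeing only the unsegmented output of `realize` has to infer both the word boundaries and the underlying /t/s.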
