Synthesis of filled pauses based on a disfluent speech model

This paper presents a new approach to the synthesis of filled pauses. The problem is tackled from the point of view of disfluent speech synthesis. Based on the synthetic disfluent speech model, we analyse the features that describe filled pauses and propose a model to predict them. The model was implemented and evaluated perceptually, with successful results.
