Speaking style effects in the production of disfluencies

Abstract This work explores speaking style effects in the production of disfluencies. University lectures and map-task dialogues are analyzed to evaluate whether the prosodic strategies used when uttering disfluencies vary across speaking styles. Our results show that the distribution of disfluency types is not arbitrary across lectures and dialogues. Moreover, although there is a statistically significant cross-style strategy of prosodic contrast marking (pitch and energy increases) between the region to repair and the repair of fluency, this strategy is displayed differently depending on the specific speech task. The overall patterns observed in the lectures, with regularities ascribed to speaker and disfluency types, do not hold with the same strength for the dialogues, due to underlying specificities of the communicative purposes. The tempo patterns found for both speech tasks also confirm their distinct behaviour, evidencing the more dynamic tempo characteristics of dialogues. In university lectures, prosodic cues are given to the listener both for the units inside disfluent regions and between these and the adjacent contexts. This suggests a stronger prosodic contrast marking of disfluency–fluency repair when compared to dialogues, as if teachers were monitoring the different regions (the introduction to a disfluency, the disfluency itself and the beginning of the repair) and demarcating them in very contrastive ways.
