Pitch declination and reset as a function of utterance duration in conversational speech data

This paper describes the declination trends of f0 in conversational speech data. A 10-minute dialogue interaction from a corpus of spontaneous speech was annotated to identify inter-silence units (ISUs) and turns. The ISUs were further annotated in detail for communicative type and pitch pattern. f0 declination was measured by (1) fitting a regression line to the f0 trajectory and (2) fitting additional regression lines to the data points below and above the original (central) regression line. The slope of declination and the height of the ISU/turn-initial f0 peak were examined as a function of the duration of the ISU or turn. The results suggest that declination is indeed present in conversational speech data, at the level of both the ISU and the turn (73% of the analysed ISUs exhibited a negative f0 declination slope). As the duration of the ISU or turn increases, the slope tends to become less steep and the ISU/turn-initial f0 peak tends to become higher. The results are discussed in the context of Projection and Reaction theories and of Hard vs. Soft preplanning of speech production. The findings are of potential interest for the development of human-machine dialogue systems.
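
The two-step line fitting described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`fit_line`, `declination_lines`), the use of ordinary least squares via NumPy, the treatment of unvoiced frames as NaN, and the choice of Hz rather than semitones are all assumptions made for illustration.

```python
import numpy as np


def fit_line(t, f0):
    """Least-squares line through (t, f0); returns (slope, intercept)."""
    if len(t) < 2:
        raise ValueError("need at least two points to fit a line")
    slope, intercept = np.polyfit(t, f0, 1)
    return float(slope), float(intercept)


def declination_lines(t, f0):
    """Fit a central line, then a top line and a bottom line.

    t  : frame times in seconds for one ISU or turn (hypothetical input)
    f0 : f0 values per frame; unvoiced frames are assumed to be NaN
    """
    t = np.asarray(t, dtype=float)
    f0 = np.asarray(f0, dtype=float)

    # Keep voiced frames only.
    voiced = np.isfinite(f0)
    t, f0 = t[voiced], f0[voiced]

    # Step 1: central regression line over all voiced points.
    central = fit_line(t, f0)

    # Step 2: split the points by the sign of their residual and
    # fit one line to the points above and one to the points below.
    residual = f0 - (central[0] * t + central[1])
    above = residual > 0
    below = residual < 0
    top = fit_line(t[above], f0[above])
    bottom = fit_line(t[below], f0[below])

    return {"central": central, "top": top, "bottom": bottom}


if __name__ == "__main__":
    # Synthetic 2-second contour falling at 20 Hz/s with alternating peaks and valleys.
    t = np.linspace(0.0, 2.0, 50)
    f0 = 200.0 - 20.0 * t + 10.0 * np.where(np.arange(50) % 2 == 0, 1.0, -1.0)
    for name, (slope, intercept) in declination_lines(t, f0).items():
        print(f"{name}: slope={slope:.2f} Hz/s, intercept={intercept:.2f} Hz")
```

Under these assumptions, a negative central slope corresponds to declination, and the top and bottom lines track the upper and lower portions of the contour separately. Relating slope to duration would then amount to computing these slopes per ISU or turn and comparing them across duration values.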
