Context Cues For Classification Of Competitive And Collaborative Overlaps

Being able to respond appropriately to users’ overlaps should be seen as one of the core competencies of incremental dialogue systems. At the same time, identifying whether an interlocutor wants to support or grab the turn is a task that comes naturally to humans but has not yet been implemented in such systems. Motivated by this, we first investigate whether prosodic characteristics of speech in the vicinity of overlaps are significantly different from prosodic characteristics in the vicinity of non-overlapping speech. We then test the suitability of different context sizes, both preceding and following the overlap but excluding features of the overlap itself, for the automatic classification of collaborative and competitive overlaps. We also test whether the fusion of preceding and succeeding contexts improves the classification. Preliminary results indicate that the optimal context for classification of overlap lies at 0.2 seconds preceding the overlap and up to 0.3 seconds following it. We demonstrate that we are able to classify collaborative and competitive overlap with a median accuracy of 63%.
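As an illustration only, the context windows described in the abstract could be delimited as in the sketch below. The function name `context_windows`, the `Window` type, and the clamping at time zero are assumptions made for this example and are not taken from the paper; only the 0.2 s and 0.3 s defaults come from the abstract. The sketch computes time spans and does not extract prosodic features or perform classification.

```python
from typing import NamedTuple, Tuple


class Window(NamedTuple):
    """A time span in seconds (hypothetical helper type)."""
    start: float
    end: float


def context_windows(
    overlap_start: float,
    overlap_end: float,
    preceding: float = 0.2,  # seconds before the overlap, per the abstract
    following: float = 0.3,  # seconds after the overlap, per the abstract
) -> Tuple[Window, Window]:
    """Return the preceding and following context windows of an overlap.

    The overlap interval itself is excluded from both windows.
    Clamping the preceding window at 0.0 is an assumption of this sketch.
    """
    if overlap_end < overlap_start:
        raise ValueError("overlap_end must not precede overlap_start")
    pre = Window(max(0.0, overlap_start - preceding), overlap_start)
    post = Window(overlap_end, overlap_end + following)
    return pre, post
```

For an overlap spanning 1.0 s to 1.5 s, this yields a preceding window ending at 1.0 s and a following window starting at 1.5 s, so no part of the overlap falls inside either context.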
