Agreement and disagreement utterance detection in conversational speech by extracting and integrating local features

This paper presents a novel framework for automatically detecting agreement and disagreement utterances in natural conversation. Such detection is critical for conversation understanding tasks such as meeting summarization. One difficulty in detecting agreement and disagreement utterances in natural conversation is the ambiguity of the utterance unit. Utterances are usually segmented at short pauses, but in conversation multiple sentences are often uttered in one breath. Such utterances exhibit the characteristics of agreement or disagreement only in some parts, not across the whole utterance. This is a problem for conventional methods, which assume that each utterance is a single sentence and extract global features from the whole utterance. To deal with this problem, we propose a detection framework that uses only local prosodic and lexical features. The local features are extracted from short windows that cover just a few words. Posteriors of agreement, disagreement, and others are estimated window by window and integrated to yield a final decision. Experiments on free discussion speech show that the proposed method, through its use of local features, offers significantly higher accuracy in detecting agreement and disagreement utterances.
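
The window-by-window idea described in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the three-word window, the cue-word scorer `toy_posterior` (a hypothetical stand-in for a trained classifier, using lexical cues only and no prosody), the max-over-windows integration rule, and the 0.5 threshold. The abstract does not say how the posteriors are actually integrated.

```python
from typing import Dict, List, Sequence


def sliding_windows(words: Sequence[str], size: int = 3, step: int = 1) -> List[List[str]]:
    """Split a word sequence into short overlapping windows of `size` words."""
    if len(words) <= size:
        return [list(words)]
    return [list(words[i:i + size]) for i in range(0, len(words) - size + 1, step)]


def toy_posterior(window: Sequence[str]) -> Dict[str, float]:
    """Hypothetical local classifier: cue-word counts normalized to posteriors.

    A real system would use a trained model over prosodic and lexical features.
    """
    agree_cues = {"yes", "right", "exactly", "agree"}
    disagree_cues = {"no", "but", "disagree", "wrong"}
    scores = {
        "agreement": 1.0 + 2.0 * sum(w in agree_cues for w in window),
        "disagreement": 1.0 + 2.0 * sum(w in disagree_cues for w in window),
        "others": 1.5,
    }
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}


def integrate_max(posteriors: Sequence[Dict[str, float]], threshold: float = 0.5) -> str:
    """Assumed integration rule: keep the peak posterior of each target class
    over all windows, and fall back to "others" if no peak reaches `threshold`."""
    peak = {
        label: max(p[label] for p in posteriors)
        for label in ("agreement", "disagreement")
    }
    best = max(peak, key=peak.get)
    return best if peak[best] >= threshold else "others"


def classify_utterance(text: str, size: int = 3) -> str:
    """Classify a whole utterance from local, window-level posteriors only."""
    windows = sliding_windows(text.lower().split(), size=size)
    return integrate_max([toy_posterior(w) for w in windows])


if __name__ == "__main__":
    # The agreement cue covers only the first few words of a long utterance.
    print(classify_utterance(
        "yes exactly that is what i meant and then we should move to the next item"
    ))
```

Taking the peak over windows is one way to let a short agreeing or disagreeing segment decide the label even when the rest of a long utterance is neutral; averaging the window posteriors would be another option, at the cost of diluting such short segments.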
