Contrasting Multi-Lingual Prosodic Cues to Predict Verbal Feedback for Rapport

Verbal feedback is an important information source in establishing interactional rapport. However, predicting verbal feedback across languages is challenging due to language-specific differences, inter-speaker variation, and the relative sparseness and optionality of verbal feedback. In this paper, we employ an approach that combines classifier weighting with SMOTE oversampling to improve verbal feedback prediction in Arabic, English, and Spanish dyadic conversations. This approach improves the prediction of verbal feedback by up to 6-fold while maintaining high overall accuracy. Analysis of the highly weighted features shows widespread use of pitch, with more varied use of intensity and duration.
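
As a rough illustration of the two ingredients named in the abstract, the sketch below pairs inverse-frequency class weights with a minimal SMOTE-style oversampler. It is not the authors' implementation: the function names `smote` and `balanced_class_weights`, the two-dimensional toy features, and the reading of "classifier weighting" as per-class weights are all assumptions made for this example.

```python
import random
from collections import Counter


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority samples."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n = len(labels)
    return {c: n / (len(counts) * cnt) for c, cnt in counts.items()}


# Toy data (hypothetical): two prosodic features per frame, e.g. pitch and intensity.
no_feedback = [[0.1 * i, 0.2 * i] for i in range(20)]          # majority class
feedback = [[5.0, 5.0], [5.5, 5.2], [6.0, 5.9], [5.2, 6.1]]    # sparse minority class

labels = [0] * len(no_feedback) + [1] * len(feedback)
weights = balanced_class_weights(labels)

new_points = smote(feedback, n_new=8, k=3)
oversampled_feedback = feedback + new_points
```

In this sketch the weights would be passed to a classifier as per-class misclassification costs, and the classifier would be trained on `no_feedback` plus `oversampled_feedback`. Each synthetic point lies on a line segment between two real minority samples, so it stays within the range of the observed feedback examples.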
