Head movements, facial expressions and feedback in conversations: empirical evidence from Danish multimodal data

This article deals with multimodal feedback in two Danish multimodal corpora: a collection of map-task dialogues and a corpus of free conversations between pairs of subjects meeting for the first time. Machine learning techniques are applied to both datasets to investigate the relations between non-verbal behaviour, more specifically head movements and facial expressions, and speech with regard to the expression of feedback. In the map-task data, we study the extent to which the dialogue act type of linguistic feedback expressions can be classified automatically from the non-verbal features. In the conversational data, on the other hand, non-verbal and speech features are used together to distinguish feedback from other multimodal behaviours. Overall, the results of the two sets of experiments indicate that head movements, and to a lesser extent facial expressions, are important indicators of feedback, and that gestures and speech disambiguate each other in the machine learning process.
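
To make the classification setup concrete, the following is a minimal sketch of the kind of experiment described: a classifier predicting the dialogue act type of a feedback expression from categorical non-verbal annotations. All feature names, feature values, and labels below are hypothetical illustrations, and scikit-learn's Naive Bayes stands in for whatever learners were actually used in the experiments.

```python
# Hypothetical sketch: classify the dialogue act type of feedback
# expressions from head-movement and facial-expression annotations.
from sklearn.feature_extraction import DictVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

# Toy annotated samples: each dict holds categorical non-verbal features,
# each label a feedback dialogue act type. All values are illustrative only.
samples = [
    {"head_movement": "nod", "facial_expression": "smile"},
    {"head_movement": "nod_repeated", "facial_expression": "none"},
    {"head_movement": "shake", "facial_expression": "frown"},
    {"head_movement": "tilt", "facial_expression": "raised_brows"},
] * 10  # repeated to give cross-validation enough data
labels = ["accept", "accept", "reject", "clarify"] * 10

# One-hot encode the categorical features, then fit a Naive Bayes model
# and estimate accuracy with 5-fold cross-validation.
model = make_pipeline(DictVectorizer(sparse=True), MultinomialNB())
scores = cross_val_score(model, samples, labels, cv=5)
print(f"mean accuracy: {scores.mean():.2f}")
```

The same pipeline shape extends to the second experiment by adding speech-derived features (e.g. the lexical form of the feedback expression) to each feature dict and relabelling the targets as feedback versus non-feedback.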
