Embodied Communicative Activity in Cooperative Conversational Interactions - Studies in Visual Interaction Management

Non-verbal communication is important for maintaining the fluency of conversation. Gestures, facial expressions, and eye gaze serve as efficient means of conveying feedback and provide subtle cues for controlling and organising conversations. In this article, we discuss the relation between verbal and non-verbal feedback from the point of view of communicative activity, focusing especially on hand gestures and body movement in the coordination of interaction. We call this "Visual Interaction Management". By combining a top-down approach, i.e. manual annotation and analysis of the data, with a bottom-up analysis of speech and visual signals, we can visualize the speakers' speech and gesture activity and align it with those gestures and body movements that the interlocutors interpret as communicatively important. For the bottom-up analysis, we use Echo State Networks, a recurrent neural network architecture, to recognise patterns of communicative behaviour.
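The article does not include code, but as a rough illustration of the Echo State Network idea named above (a recurrent reservoir with fixed random weights and a trained linear readout), here is a minimal sketch in Python. All names, sizes, and parameters (e.g. n_reservoir, spectral_radius, the ridge-regression readout) are illustrative assumptions, not the authors' implementation or data.

```python
import numpy as np

# Minimal Echo State Network sketch (illustrative only).
# The reservoir weights are random and stay fixed; only the
# linear readout is trained, here by ridge regression.

rng = np.random.default_rng(0)

n_inputs, n_reservoir, n_outputs = 3, 200, 1   # e.g. multimodal features in, activity label out
spectral_radius, ridge = 0.9, 1e-6             # assumed hyperparameters

W_in = rng.uniform(-0.5, 0.5, (n_reservoir, n_inputs))
W = rng.uniform(-0.5, 0.5, (n_reservoir, n_reservoir))
# Rescale so the largest eigenvalue magnitude is spectral_radius,
# a common heuristic for the echo state property.
W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))

def run_reservoir(U):
    """Drive the reservoir with input sequence U (T x n_inputs); return states (T x n_reservoir)."""
    x = np.zeros(n_reservoir)
    states = np.empty((len(U), n_reservoir))
    for t, u in enumerate(U):
        x = np.tanh(W_in @ u + W @ x)
        states[t] = x
    return states

# Placeholder training data standing in for real speech/gesture features and labels.
U_train = rng.standard_normal((500, n_inputs))
Y_train = rng.standard_normal((500, n_outputs))

X = run_reservoir(U_train)
# Ridge-regression readout: W_out = (X'X + ridge*I)^-1 X'Y
W_out = np.linalg.solve(X.T @ X + ridge * np.eye(n_reservoir), X.T @ Y_train)

Y_pred = run_reservoir(U_train) @ W_out  # readout prediction over the sequence
```

In practice the inputs would be frame-level speech and video features and the targets would be the annotated communicative labels; only the readout weights need training, which is what makes the approach attractive for this kind of signal-level analysis.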
