The ALICO corpus: analysing the active listener

The Active Listening Corpus (ALICO) is a multimodal data set of spontaneous dyadic conversations in German, with diverse speech and gesture annotations of both dialogue partners. The annotations comprise transcriptions of short feedback expressions with corresponding communicative-function interpretations, as well as segmentations of interpausal units, words, rhythmic prominence intervals and vowel-to-vowel intervals. Additionally, ALICO contains head gesture annotations of both interlocutors. The corpus contributes to research on spontaneous human–human interaction, on functional relations between modalities, and on timing variability in dialogue. It also provides data that differentiates between distracted and attentive listeners. We describe the main characteristics of the corpus and briefly present the most important results obtained from analyses in recent years.
