A Multimodal Corpus for Mutual Gaze and Joint Attention in Multiparty Situated Interaction

In this paper we present a corpus of multiparty situated interaction where participants collaborated on moving virtual objects on a large touch screen. A moderator facilitated the discussion and di ...
