Investigating speaker gaze and pointing behaviour in human-computer interaction with the mint.tools collection

Can speaker gaze and speaker arm movements be used as a practical information source for naturalistic conversational human‐computer interfaces? To investigate this question, we recorded a corpus of interactions with a (wizarded) system, using eye tracking and motion capture. In this paper, we describe the recording, the analysis infrastructure that we built for such studies, and the analysis we performed on these data. We find that, with some initial calibration, a “minimally invasive”, stationary camera-based setting provides data of sufficient quality to support interaction.
