Dynamic Bayesian network-based interest estimation for visual attentive presentation agents

In this paper, we report on an interactive system and the results of a formal user study carried out to compare two approaches to estimating users' interest in a multimodal presentation based on their eye gaze. The scenario consists of a virtual showroom where two 3D agents present product items in an entertaining way and adapt their performance according to users' (in)attentiveness. In order to infer users' attention and visual interest with regard to interface objects, our system analyzes eye movements in real time. Interest detection algorithms used in previous research determine an object of interest based on the time that eye gaze dwells on that object. However, this kind of algorithm does not seem to be well suited for dynamic presentations, where the goal is to assess the user's focus of attention with regard to dynamically changing presentation content. Here, the current context of the object of interest has to be considered, i.e., whether the visual object is part of (or contributes to) the current presentation content. We therefore propose to estimate the interest (or non-interest) of a user by means of dynamic Bayesian networks that can take into account the current context of the attention-receiving object, allowing the presentation agents to respond in a timely and appropriate way. The benefits of our approach are demonstrated both theoretically and empirically.
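To make the contrast concrete, the sketch below implements both estimators in miniature: a dwell-time baseline and a two-state dynamic Bayesian network whose per-slice observation encodes the presentation context, i.e., whether the fixated object is currently on-topic. This is a minimal illustration of the general filtering technique under our own assumptions, not the network from the paper; the state space, the binary on/off-topic observation, and all probabilities are invented for the example.

    import numpy as np

    def dwell_time_interest(fixations, threshold=0.8):
        """Baseline dwell-time detector: an object becomes the object of
        interest once the accumulated gaze time on it exceeds a threshold
        (seconds). `fixations` is a list of (object_id, duration) pairs."""
        dwell = {}
        for obj, duration in fixations:
            dwell[obj] = dwell.get(obj, 0.0) + duration
            if dwell[obj] >= threshold:
                return obj
        return None

    # Two-state DBN over the user's interest in the ongoing presentation.
    # Hidden state: 0 = not interested, 1 = interested.
    # Observation per time slice: is the fixated object part of the current
    # presentation content (1 = on-topic) or not (0 = off-topic)?
    # All probabilities below are illustrative placeholders.

    TRANSITION = np.array([   # P(interest_t | interest_{t-1})
        [0.8, 0.2],           # not interested -> (not interested, interested)
        [0.1, 0.9],           # interested     -> (not interested, interested)
    ])
    EMISSION = np.array([     # P(obs | interest_t); columns: [off-topic, on-topic]
        [0.7, 0.3],           # not interested: gaze mostly strays off-topic
        [0.2, 0.8],           # interested: gaze mostly follows the presentation
    ])

    def filter_interest(belief, gaze_on_topic):
        """One forward-filtering step: predict with the transition model,
        weight by the likelihood of the observed gaze, and normalize."""
        predicted = belief @ TRANSITION
        posterior = predicted * EMISSION[:, gaze_on_topic]
        return posterior / posterior.sum()

    # Usage: the agents could re-engage the user once P(interested) drops.
    belief = np.array([0.5, 0.5])
    for on_topic in [1, 1, 0, 0, 0]:   # user gradually drifts off-topic
        belief = filter_interest(belief, on_topic)
        print(f"P(interested) = {belief[1]:.2f}")

Unlike the dwell-time baseline, the filtered belief degrades gracefully: a single off-topic fixation lowers the interest estimate without immediately declaring a new object of interest, which is what lets the agents time their responses appropriately.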
