Learning Cooperative Personalized Policies from Gaze Data

An ideal Mixed Reality (MR) system would present virtual information (e.g., a label) only when it is useful to the person. However, deciding when a label is useful is challenging: it depends on many factors, including the current task, previous knowledge, and context. In this paper, we propose a Reinforcement Learning (RL) method that learns when to show or hide an object's label from eye movement data. We demonstrate the capabilities of this approach by showing that an intelligent agent can learn cooperative policies that support users in a visual search task better than manually designed heuristics. Furthermore, we show the applicability of our approach to more realistic environments and use cases (e.g., grocery shopping). By posing MR object labeling as a model-free RL problem, we can learn policies implicitly by observing users' behavior, without requiring a visual search model or data annotation.
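To make the model-free framing concrete, the following is a minimal sketch of tabular Q-learning over a show/hide action space. It is not the method from the paper: the state encoding (whether gaze is on the object, plus a coarse dwell-time bin), the simulated reward, and all names (`simulated_reward`, `sample_state`, `train`, `greedy_policy`) are assumptions made purely for illustration. In particular, the reward here comes from a toy rule standing in for user behavior, whereas the abstract describes learning from observed gaze data.

```python
import random
from collections import defaultdict

HIDE, SHOW = 0, 1  # the two actions available to the labeling agent


def simulated_reward(state, action):
    """Hypothetical reward: a label helps only during long dwells on the object."""
    on_object, dwell_bin = state
    label_useful = on_object == 1 and dwell_bin == 2
    if action == SHOW:
        return 1.0 if label_useful else -1.0  # penalize clutter
    return -1.0 if label_useful else 0.0      # penalize withholding help


def sample_state(rng):
    """Toy gaze state: (gaze on object? 0/1, dwell-time bin 0..2), drawn uniformly."""
    return (rng.randint(0, 1), rng.randint(0, 2))


def train(steps=5000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Epsilon-greedy tabular Q-learning; no model of the user is required."""
    rng = random.Random(seed)
    q = defaultdict(lambda: [0.0, 0.0])  # q[state][action]
    state = sample_state(rng)
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice((HIDE, SHOW))
        else:
            action = max((HIDE, SHOW), key=lambda a: q[state][a])
        reward = simulated_reward(state, action)
        next_state = sample_state(rng)
        # Standard Q-learning update toward the bootstrapped target.
        target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (target - q[state][action])
        state = next_state
    return q


def greedy_policy(q, state):
    """Return the action with the highest learned value in the given state."""
    return max((HIDE, SHOW), key=lambda a: q[state][a])
```

Under these toy assumptions, calling `greedy_policy(train(), (1, 2))` queries the learned policy for a long dwell on the object. A real system would replace `simulated_reward` and `sample_state` with signals derived from recorded eye movements.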
