Gaze Regularized Imitation Learning: Learning Continuous Control from Human Gaze

Approaches for teaching agents via human demonstrations have been widely studied and successfully applied to multiple domains. However, the majority of imitation learning work utilizes only behavioral information from the demonstrator, i.e., which actions were taken, and ignores other useful information. In particular, eye gaze can give valuable insight into where the demonstrator is allocating visual attention, and holds the potential to improve agent performance and generalization. In this work, we propose Gaze Regularized Imitation Learning (GRIL), a novel context-aware imitation learning architecture that learns concurrently from both human demonstrations and eye gaze to solve tasks where visual attention provides important context. We apply GRIL to a visual navigation task in which an unmanned quadrotor is trained to search for and navigate to a target vehicle in a photo-realistic simulated environment. We show that GRIL outperforms several state-of-the-art gaze-based imitation learning algorithms, simultaneously learns to predict human visual attention, and generalizes to scenarios not present in the training data. Supplemental videos are available at the project site https://sites.google.com/view/gaze-regularized-il/ and code at https://github.com/ravikt/gril

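As a rough illustration of the gaze-regularization idea described above, the sketch below combines a behavior-cloning action loss with an auxiliary gaze-prediction loss through a shared visual encoder. This is a minimal PyTorch sketch under stated assumptions, not the authors' implementation: the `GRILPolicy` layer sizes, the sigmoid heatmap head, and the `lambda_gaze` weight are illustrative choices not taken from the paper.

```python
# Minimal sketch of a gaze-regularized imitation loss in the spirit of GRIL.
# GRILPolicy, its layer sizes, and lambda_gaze are hypothetical, chosen only
# to illustrate the multi-task structure (actions + gaze from one encoder).
import torch
import torch.nn as nn
import torch.nn.functional as F

class GRILPolicy(nn.Module):
    """Shared visual encoder with two heads: continuous control actions
    and a human gaze heatmap (the regularizing auxiliary task)."""
    def __init__(self, action_dim: int = 4):
        super().__init__()
        # Shared convolutional encoder over RGB observations.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, 3, stride=2), nn.ReLU(),
        )
        # Action head: regress continuous quadrotor commands.
        self.action_head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, action_dim),
        )
        # Gaze head: decode a coarse visual-attention heatmap.
        self.gaze_head = nn.Conv2d(64, 1, kernel_size=1)

    def forward(self, obs):
        z = self.encoder(obs)
        return self.action_head(z), self.gaze_head(z)

def gril_loss(policy, obs, expert_actions, gaze_maps, lambda_gaze=0.5):
    """Behavior-cloning loss regularized by gaze-prediction error."""
    pred_actions, pred_gaze = policy(obs)
    bc_loss = F.mse_loss(pred_actions, expert_actions)
    # Resize recorded human gaze heatmaps to the head's resolution.
    target = F.interpolate(gaze_maps, size=pred_gaze.shape[-2:])
    gaze_loss = F.mse_loss(torch.sigmoid(pred_gaze), target)
    return bc_loss + lambda_gaze * gaze_loss
```

Sharing the encoder is what makes the gaze loss act as a regularizer: gradients from the gaze head shape the same visual features the action head consumes, steering them toward regions a human demonstrator attends to.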