SalGaze: Personalizing Gaze Estimation using Visual Saliency

Traditional gaze estimation methods typically require explicit user calibration to achieve high accuracy. This process is cumbersome, and recalibration is often required when factors such as illumination or pose change. To address this challenge, we introduce SalGaze, a framework that uses saliency information in the visual content to transparently adapt the gaze estimation algorithm to the user without explicit calibration. We design an algorithm to transform a saliency map into a differentiable loss map that can be used to optimize CNN-based models. SalGaze can also greatly augment standard point calibration data with implicit video saliency calibration data within a unified framework. We show accuracy improvements of over 24% when applying our technique to existing methods.
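The key idea, turning a saliency map into a differentiable training signal for a gaze CNN, can be illustrated with a small sketch. The snippet below is a hypothetical PyTorch illustration under our own assumptions (the `gaze_net` name, the box-blur smoothing, and the negative-log weighting are assumptions, not the SalGaze implementation): it converts a saliency map into a loss map and samples it at predicted gaze points with differentiable bilinear interpolation, so gradients reach the estimator.

```python
# Minimal sketch, assuming a PyTorch gaze CNN; not the authors' implementation.
import torch
import torch.nn.functional as F

def saliency_to_loss_map(saliency, eps=1e-6, blur_kernel=11):
    """Convert a saliency map (H, W) into a smooth negative-log loss map."""
    s = saliency / (saliency.sum() + eps)              # normalize to a probability map
    s = s.unsqueeze(0).unsqueeze(0)                    # (1, 1, H, W) for conv2d
    weight = torch.ones(1, 1, blur_kernel, blur_kernel) / blur_kernel ** 2
    s = F.conv2d(s, weight, padding=blur_kernel // 2)  # box blur keeps gradients smooth
    return -torch.log(s + eps)                         # low loss where saliency is high

def saliency_loss(loss_map, gaze_xy):
    """Sample the loss map at predicted gaze points (N, 2), normalized to [-1, 1],
    using differentiable bilinear interpolation."""
    grid = gaze_xy.view(1, -1, 1, 2)                   # (1, N, 1, 2) sampling grid
    sampled = F.grid_sample(loss_map, grid, align_corners=True)
    return sampled.mean()

# Usage sketch: `gaze_net` is any CNN mapping a face crop to 2D on-screen gaze.
# gaze_xy = gaze_net(face_crops)                       # (N, 2) in [-1, 1]
# loss_map = saliency_to_loss_map(frame_saliency)      # frame_saliency: (H, W)
# saliency_loss(loss_map, gaze_xy).backward()          # implicit, calibration-free adaptation
```

The bilinear sampling via `F.grid_sample` is what makes the loss map usable for gradient-based optimization: the predicted gaze coordinates enter the loss continuously rather than through a hard argmax over the saliency map.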
