Paper information - Few-Shot Adaptive Gaze Estimation

Few-Shot Adaptive Gaze Estimation

Inter-personal anatomical differences limit the accuracy of person-independent gaze estimation networks. Yet gaze errors must be lowered further to enable applications that require higher accuracy. Further gains can be achieved by personalizing gaze networks, ideally with few calibration samples. However, over-parameterized neural networks are not amenable to learning from few examples, as they quickly over-fit. We embrace these challenges and propose a novel framework for Few-shot Adaptive GaZE Estimation (FAZE), which learns person-specific gaze networks from very few (≤ 9) calibration samples. FAZE learns a rotation-aware latent representation of gaze via a disentangling encoder-decoder architecture, along with a highly adaptable gaze estimator trained using meta-learning. It can adapt to any new person and yields significant performance gains with as few as 3 samples, reaching state-of-the-art performance of 3.18° on GazeCapture, a 19% improvement over prior art. We open-source our code at https://github.com/NVlabs/few_shot_gaze
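
To make the calibration idea concrete, here is a minimal NumPy sketch of adapting a gaze estimator to a new person from 3 samples. It is not the authors' implementation: it stands in a linear gaze head on made-up latent features for the learned encoder, uses plain gradient descent instead of the meta-learned initialization, and runs on synthetic data in which a person differs from the generic model only by a constant angular offset. The names `angular_error_deg`, `adapt`, and `person`, the (pitch, yaw) vector convention, and every numeric value are assumptions chosen for illustration.

```python
import numpy as np

def angular_error_deg(pred, true):
    """Mean angle in degrees between gaze directions given as (pitch, yaw) in radians."""
    def to_vec(py):
        # Assumed convention: unit 3D vector built from pitch and yaw.
        pitch, yaw = py[:, 0], py[:, 1]
        return np.stack([np.cos(pitch) * np.sin(yaw),
                         np.sin(pitch),
                         np.cos(pitch) * np.cos(yaw)], axis=1)
    cos = np.clip(np.sum(to_vec(pred) * to_vec(true), axis=1), -1.0, 1.0)
    return float(np.degrees(np.arccos(cos)).mean())

def adapt(w, b, feats, gaze, lr=0.1, steps=50):
    """Fine-tune a linear gaze head on a few calibration samples (MSE, gradient descent)."""
    w, b = w.copy(), b.copy()
    for _ in range(steps):
        residual = feats @ w + b - gaze              # shape (k, 2)
        w -= lr * feats.T @ residual / len(feats)    # gradient w.r.t. weights
        b -= lr * residual.mean(axis=0)              # gradient w.r.t. bias
    return w, b

rng = np.random.default_rng(0)
d, k = 4, 3                                  # latent size and calibration samples (hypothetical)
w0 = rng.normal(scale=0.5, size=(d, 2))      # stand-in for a person-independent head
b0 = np.zeros(2)
offset = np.array([0.05, -0.03])             # synthetic person-specific bias in radians

def person(n):
    """Draw n synthetic (feature, gaze) pairs for the simulated person."""
    feats = rng.normal(scale=0.2, size=(n, d))
    return feats, feats @ w0 + offset

calib_x, calib_y = person(k)
test_x, test_y = person(200)
w1, b1 = adapt(w0, b0, calib_x, calib_y)

err_before = angular_error_deg(test_x @ w0 + b0, test_y)
err_after = angular_error_deg(test_x @ w1 + b1, test_y)
print(f"before: {err_before:.2f} deg, after {k}-shot adaptation: {err_after:.2f} deg")
```

In this toy setting the held-out error drops after adaptation because the three samples are enough to recover most of the constant offset. The abstract's point is that a full network over-fits in this regime, which is why FAZE pairs a compact latent representation with a meta-learned estimator.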
