Neural Networks for Semantic Gaze Analysis in XR Settings

Virtual-reality (VR) and augmented-reality (AR) technology is increasingly combined with eye tracking. This combination broadens both fields and opens up new areas of application in which visual perception and related cognitive processes can be studied in interactive yet well-controlled settings. However, semantic gaze analysis of eye-tracking data from interactive three-dimensional scenes is a resource-intensive task, which has so far been an obstacle to its economical use. In this paper we present a novel approach that minimizes the time and information required to annotate volumes of interest (VOIs) by using techniques from object recognition. To this end, we train convolutional neural networks (CNNs) on synthetic data sets derived from virtual models using image augmentation techniques. We evaluate our method in real and virtual environments, showing that it can compete with state-of-the-art approaches while not relying on additional markers or preexisting databases, instead offering cross-platform use.
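
The following is a minimal sketch of the training step outlined in the abstract: a CNN classifier trained on augmented synthetic renderings of virtual models, one class per VOI. It assumes PyTorch/torchvision; the directory layout (`renders/`, one subfolder of rendered views per VOI class), the augmentation parameters, and the ResNet-18 backbone are illustrative assumptions, not the authors' published configuration.

```python
# Sketch (not the authors' code): train a CNN on synthetic, augmented
# renderings of virtual models to classify volumes of interest (VOIs).
# Assumes PyTorch/torchvision; paths and hyperparameters are illustrative.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

# Image augmentation applied to the synthetic renderings, intended to
# bridge the gap between clean rendered views and real/VR camera frames.
augment = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.6, 1.0)),
    transforms.RandomRotation(15),
    transforms.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4),
    transforms.GaussianBlur(kernel_size=5, sigma=(0.1, 2.0)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Hypothetical layout: one folder of rendered views per VOI class,
# e.g. renders/chair/, renders/door/, ...
train_set = datasets.ImageFolder("renders/", transform=augment)
train_loader = DataLoader(train_set, batch_size=32, shuffle=True)

# A standard CNN backbone; the final layer is resized to the number of VOIs.
model = models.resnet18(weights=None)
model.fc = nn.Linear(model.fc.in_features, len(train_set.classes))

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

model.train()
for epoch in range(10):
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```

At analysis time, a classifier of this kind could be applied to image patches cropped around each recorded gaze point to assign fixations to VOIs without fiducial markers; the details of that gaze-to-patch mapping are outside the scope of this sketch.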
