Exploitation of Gaze Data for Photo Region Labeling in an Immersive Environment

Metadata describing the content of photos is highly important for applications such as image search and for building training sets for object detection algorithms. In this work, we attach tags to image regions to describe photo semantics in more detail. This region labeling requires no additional effort from the user: it is derived solely from eye-tracking data recorded while users play a gaze-controlled game. In the game EyeGrab, users classify and rate photos falling down the screen; the photos are classified according to a given category under time pressure. The game was evaluated in a study with 54 subjects. The results show that the given categories can be assigned to image regions with a precision of up to 61%. Region labeling in an immersive environment like EyeGrab thus performs almost as well as a previous, much more controlled classification experiment.
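The core idea of deriving region labels from gaze can be illustrated with a minimal sketch: accumulate fixation dwell time per image region and assign the photo's category tag to the region that attracted the most gaze. This is a simplified illustration, not the paper's actual algorithm; the fixation format, the `region_of` lookup, and the dwell-time heuristic are all assumptions.

```python
# Hypothetical sketch of gaze-based region labeling: the region with the
# highest accumulated fixation duration receives the photo's category tag.
# Fixation tuples, the region lookup, and the heuristic are illustrative
# assumptions, not the method evaluated in the paper.

from collections import defaultdict


def label_region_by_gaze(fixations, region_of, tag):
    """fixations: list of (x, y, duration_ms); region_of(x, y) -> region id.

    Returns (region_id, tag) for the most-fixated region, or None if there
    were no fixations on the photo.
    """
    dwell = defaultdict(float)
    for x, y, duration in fixations:
        dwell[region_of(x, y)] += duration
    if not dwell:
        return None
    best_region = max(dwell, key=dwell.get)
    return best_region, tag


# Toy usage: two regions split at x = 100; most dwell time falls left.
fixations = [(50, 40, 300.0), (150, 60, 120.0), (60, 45, 250.0)]
region_of = lambda x, y: "left" if x < 100 else "right"
print(label_region_by_gaze(fixations, region_of, "cat"))  # ('left', 'cat')
```

In practice, the regions would come from an image segmentation step, and a precision figure such as the reported 61% would be computed by comparing these gaze-derived labels against ground-truth region annotations.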
