FixTag: An algorithm for identifying and tagging fixations to simplify the analysis of data collected by portable eye trackers

Video-based eye trackers produce an output video that marks where a subject is looking, the subject's Point-of-Regard (POR), in each frame of a video of the scene. This information can be extremely valuable, but its analysis can be overwhelming. Analysis of data from portable (wearable) eye trackers is especially daunting because the scene video may be constantly changing, which makes automatic analysis more difficult. A common way to begin analysis of POR data is to group these data into fixations. In a previous article, we compared the fixations identified (i.e., start and end marked) automatically by an algorithm to those identified manually by users (i.e., manual coders). Here, we extend this automatic identification of fixations to tagging each fixation to a Region-of-Interest (ROI). Our fixation tagging algorithm, FixTag, requires the relative 3D positions of the vertices of the ROIs and calibration of the scene camera. Fixation tagging is performed by first calculating the camera projection matrices for keyframes of the scene video (captured by the eye tracker) via an iterative structure and motion recovery algorithm. These matrices are then used to project the 3D ROI vertices into the keyframes. A POR for each fixation is matched to a point in the closest keyframe, and that point is then checked against the 2D projected ROI vertices for tagging. Our fixation tags were compared to those produced by three manual coders who tagged the automatically identified fixations for two different scenarios. For each scenario, eight ROIs were defined along with the 3D positions of eight calibration points. Therefore, 17 tags were available for each fixation: 8 for ROIs, 8 for calibration points, and 1 for “other.” For the first scenario, a subject was tracked looking through products on four store shelves, resulting in 182 automatically identified fixations. Our automatic tagging algorithm produced tags that matched those produced by at least one manual coder for 181 out of the 182 fixations (99.5% agreement). For the second scenario, a subject was tracked looking at two posters on adjoining walls of a room. Our algorithm matched at least one manual coder's tag for 169 out of the 172 automatically identified fixations (98.3% agreement).
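The projection and tagging steps described above can be illustrated with a minimal sketch. It is not the FixTag implementation: it assumes the 3x4 projection matrix of the closest keyframe is already known, that the POR has already been mapped into that keyframe's pixel coordinates, and that each ROI is a simple polygon lying in front of the camera. The names `project_points`, `point_in_polygon`, `tag_fixation`, and the ROI label `shelf_A` are hypothetical, and the even-odd ray-casting test is one common choice of inside test, not necessarily the one the authors used.

```python
import numpy as np


def project_points(P, pts3d):
    """Project Nx3 world points with a 3x4 camera matrix P; returns Nx2 pixels."""
    pts3d = np.asarray(pts3d, dtype=float)
    homog = np.hstack([pts3d, np.ones((len(pts3d), 1))])  # Nx4 homogeneous
    proj = homog @ np.asarray(P, dtype=float).T           # Nx3
    return proj[:, :2] / proj[:, 2:3]                     # perspective divide


def point_in_polygon(pt, poly):
    """Even-odd ray-casting test: is the 2D point inside the polygon?"""
    x, y = pt
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def tag_fixation(por, P, rois):
    """Return the name of the first ROI whose projection contains the POR."""
    for name, verts3d in rois.items():
        if point_in_polygon(por, project_points(P, verts3d)):
            return name
    return "other"


# Hypothetical keyframe camera: intrinsics K with identity pose, P = K [I | 0].
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
P = K @ np.hstack([np.eye(3), np.zeros((3, 1))])

# One hypothetical ROI: a 1 m square, 2 m in front of the camera.
rois = {"shelf_A": [(-0.5, -0.5, 2.0), (0.5, -0.5, 2.0),
                    (0.5, 0.5, 2.0), (-0.5, 0.5, 2.0)]}

tag_center = tag_fixation((320.0, 240.0), P, rois)  # POR at the image center
tag_corner = tag_fixation((10.0, 10.0), P, rois)    # POR far from the ROI
print(tag_center, tag_corner)
```

In this toy setup the square projects to the pixel rectangle from (195, 115) to (445, 365), so a POR at the image center falls inside the ROI while a POR near the image corner receives the "other" tag. Tagging to calibration points, which the abstract also mentions, is not covered by this sketch.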
