Spatio-temporal analysis of eye fixations data in images

Computer vision algorithms such as image compression, image segmentation, context-aware image resizing, image quality assessment, and target detection are designed with the aim of replicating our visual system. Human eye fixations recorded with an eye tracker are typically used as a criterion for optimising such algorithms. In this paper, we propose a method to analyse the spatio-temporal nature of fixation data from different observers. By studying the correlation matrix constructed from the fixation data of different observers viewing the same image, we found that 21 percent of the data can be accounted for by a single eigenvector. A visual inspection of this vector shows that it represents the time sequence and locations of the objects in the image that observers deem salient. The eigenvectors are used as a benchmark for evaluating the spatio-temporal performance of Itti's classic visual saliency model. Based on results obtained from a comprehensive, publicly available dataset, we show that the proposed method can serve as ground truth for evaluating the spatio-temporal performance of saliency models. Furthermore, it can provide salient locations and their time sequence for a real-time image compression application, with promising results.
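
To make the eigenvector analysis concrete, the sketch below shows one way such a decomposition could be computed. It is a minimal illustration, not the paper's implementation: the function name `principal_fixation_component`, the assumption that each observer's fixations are flattened into a single spatio-temporal vector, and the exact normalisation of the correlation matrix are all placeholders for details specified in the paper itself.

```python
import numpy as np

def principal_fixation_component(fixation_maps):
    """Dominant spatio-temporal fixation pattern shared across observers.

    fixation_maps : ndarray of shape (n_observers, n_bins)
        Each row holds one observer's fixation data, flattened into a
        spatio-temporal vector (an assumed layout, not the paper's).
    Returns (component, explained): the leading pattern over the bins and
    the fraction of total variance it accounts for.
    """
    X = np.asarray(fixation_maps, dtype=float)
    # Centre each observer's vector; the paper's exact normalisation
    # (correlation vs. covariance) may differ.
    X = X - X.mean(axis=1, keepdims=True)
    C = (X @ X.T) / X.shape[1]           # observer-by-observer correlation matrix

    # Eigendecomposition of the symmetric matrix, largest eigenvalue first.
    eigvals, eigvecs = np.linalg.eigh(C)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    explained = eigvals[0] / eigvals.sum()   # analogous to the ~21% figure
    # Project the leading observer-space eigenvector back onto the
    # spatio-temporal bins to obtain an inspectable fixation pattern.
    component = eigvecs[:, 0] @ X
    return component, explained

# Hypothetical usage: 15 observers, a 32x32 spatial grid over 10 time slots.
rng = np.random.default_rng(0)
maps = rng.random((15, 32 * 32 * 10))
pattern, frac = principal_fixation_component(maps)
print(f"leading eigenvector accounts for {frac:.1%} of the variance")
```

In this form, the fraction of total eigenvalue energy captured by the leading eigenvector plays the role of the 21 percent figure reported above, and projecting that eigenvector back onto the spatio-temporal bins gives a map that can be inspected visually or compared against a saliency model's output.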

References

[1] Mark Wexler et al. The nonlinear structure of motion perception during smooth eye movements. Journal of Vision, 2009.
[2] Derrick J. Parkhurst et al. Modeling the role of salience in the allocation of overt visual attention. Vision Research, 2002.
[3] Ali Alsam et al. Robust metric for the evaluation of visual saliency algorithms. Journal of the Optical Society of America A: Optics, Image Science, and Vision, 2014.
[4] D. Kalman. A Singularly Valuable Decomposition: The SVD of a Matrix. 1996.
[5] Gene H. Golub et al. Calculating the singular values and pseudo-inverse of a matrix. Milestones in Matrix Computation, 2007.
[6] Ali Borji et al. Quantitative Analysis of Human-Model Agreement in Visual Saliency Modeling: A Comparative Study. IEEE Transactions on Image Processing, 2013.
[7] John K. Tsotsos et al. Saliency Based on Information Maximization. NIPS, 2005.
[8] Frédo Durand et al. Learning to predict where humans look. IEEE 12th International Conference on Computer Vision, 2009.
[9] C. Koch et al. A saliency-based search mechanism for overt and covert shifts of visual attention. Vision Research, 2000.
[10] C. Koch et al. Faces and text attract gaze independent of the task: Experimental data and computer model. Journal of Vision, 2009.
[11] Alan C. Bovik et al. GAFFE: A Gaze-Attentive Fixation Finding Engine. IEEE Transactions on Image Processing, 2008.
[12] Puneet Sharma et al. What the Eye Did Not See: A Fusion Approach to Image Coding. ISVC, 2012.
[13] Wilson S. Geisler et al. Real-time foveated multiresolution system for low-bandwidth video communication. Electronic Imaging, 1998.
[14] Laurent Itti et al. An Integrated Model of Top-Down and Bottom-Up Attention for Optimizing Detection Speed. IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06), 2006.
[15] Ali Alsam et al. Analysis of eye fixations data. 2011.
[16] K. Suder et al. The Control of Low-Level Information Flow in the Visual System. Reviews in the Neurosciences, 2000.
[17] Christof Koch et al. Attentional Selection for Object Recognition: A Gentle Way. Biologically Motivated Computer Vision, 2002.
[18] S. Ullman et al. Shifts in selective visual attention: towards the underlying neural circuitry. Human Neurobiology, 1985.
[19] Christof Koch et al. A Model of Saliency-Based Visual Attention for Rapid Scene Analysis. 2009.