Spatio-temporal saliency model to predict eye movements in video free viewing

This paper presents a spatio-temporal saliency model that predicts eye movements during free viewing of video. This biologically inspired model separates each video frame into two signals corresponding to the two main outputs of the retina (parvocellular and magnocellular). Both signals are then decomposed into elementary feature maps by cortical-like filters. These feature maps are used to form two saliency maps, a static one and a dynamic one, which are then fused into a spatio-temporal saliency map. The model is evaluated by comparing the salient areas of each frame predicted by these saliency maps (static, dynamic, spatio-temporal) to the eye positions of different subjects recorded during a free-viewing video experiment on a large database (17000 frames).
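
The pipeline described above (two pathways, a filter bank, two saliency maps, one fused map) can be illustrated with a minimal NumPy sketch. It is not the paper's implementation, and every specific in it is an assumption made for illustration: oriented Gabor filters stand in for the cortical-like filters, a plain frame difference stands in for the magnocellular (motion) pathway, and the fusion rule (sum plus product of the two maps) is a hypothetical choice. All function names and parameter values are likewise hypothetical.

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma):
    """Zero-mean oriented Gabor kernel (assumed stand-in for a cortical-like filter)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    kernel = envelope * np.cos(2.0 * np.pi * xr / wavelength)
    return kernel - kernel.mean()  # remove DC so flat regions respond with zero

def filter_fft(image, kernel):
    """Circular convolution of image with kernel, computed via the FFT."""
    kh, kw = kernel.shape
    padded = np.zeros_like(image, dtype=float)
    padded[:kh, :kw] = kernel
    # Shift the kernel centre to the origin so the output is not translated.
    padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(padded)))

def normalize(m):
    """Rescale a map to [0, 1]; a constant map becomes all zeros."""
    rng = m.max() - m.min()
    return (m - m.min()) / rng if rng > 0 else np.zeros_like(m, dtype=float)

def static_saliency(frame, n_orient=4):
    """Static map: summed magnitude of oriented filter responses (assumption)."""
    energy = np.zeros_like(frame, dtype=float)
    for k in range(n_orient):
        theta = k * np.pi / n_orient
        energy += np.abs(filter_fft(frame, gabor_kernel(9, 4.0, theta, 2.0)))
    return normalize(energy)

def dynamic_saliency(prev_frame, frame):
    """Dynamic map: absolute frame difference (assumed proxy for motion)."""
    return normalize(np.abs(frame - prev_frame))

def fuse(static_map, dynamic_map):
    """Hypothetical fusion: sum plus product, so agreement between maps is reinforced."""
    return normalize(static_map + dynamic_map + static_map * dynamic_map)
```

Given two consecutive grayscale frames as float arrays of the same shape, `fuse(static_saliency(frame), dynamic_saliency(prev_frame, frame))` yields one spatio-temporal map in [0, 1]. Such a map could then be compared against recorded eye positions, as in the evaluation the abstract describes.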
