Improving Visual Saliency by Adding ‘Face Feature Map’ and ‘Center Bias’

Faces play an important role in guiding visual attention, so adding face detection to a classical visual attention model can improve eye movement predictions. In this study, we propose a visual saliency model to predict eye movements during free viewing of videos. The model is inspired by the biology of the visual system and decomposes each frame of the videos in a database into three saliency maps, each dedicated to a particular visual feature. (a) A ‘static’ saliency map emphasizes regions that differ from their context in luminance, orientation and spatial frequency. (b) A ‘dynamic’ saliency map emphasizes moving regions, with values proportional to motion amplitude. (c) A ‘face’ saliency map emphasizes areas where a face is detected, with values proportional to the confidence of the detection. In parallel, a behavioral experiment was carried out to record participants’ eye movements while they viewed the videos. These eye movements were compared with the model’s saliency maps to quantify how well the maps predict gaze. We also examined the influence of center bias on the saliency maps and incorporated it into the model. Finally, we propose an efficient method for fusing all these saliency maps. The resulting fused master saliency map is a good predictor of participants’ eye positions.
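The pipeline described above can be illustrated with a minimal Python sketch. It is not the fusion method proposed in the study, which the abstract does not specify. The sketch assumes that the center bias is a 2D Gaussian centered on the frame, that each feature map is modulated by this bias and combined by a weighted sum, and that agreement with eye positions is scored with normalized scanpath saliency (NSS). The function names, the `weights` and `sigma_frac` parameters, and the choice of NSS are all hypothetical.

```python
import numpy as np

def normalize(m):
    """Rescale a map to [0, 1]; a constant map becomes all zeros."""
    m = np.asarray(m, dtype=float)
    span = m.max() - m.min()
    return (m - m.min()) / span if span > 0 else np.zeros_like(m)

def center_bias(shape, sigma_frac=0.25):
    """2D Gaussian centered on the frame (assumed form of the center bias)."""
    h, w = shape
    y, x = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    sy, sx = sigma_frac * h, sigma_frac * w
    return np.exp(-(((y - cy) / sy) ** 2 + ((x - cx) / sx) ** 2) / 2.0)

def fuse(static, dynamic, face, weights=(1.0, 1.0, 1.0), sigma_frac=0.25):
    """Weighted sum of center-biased feature maps (assumed fusion rule)."""
    maps = [normalize(m) for m in (static, dynamic, face)]
    bias = center_bias(maps[0].shape, sigma_frac)
    fused = sum(w * m * bias for w, m in zip(weights, maps))
    return normalize(fused)

def nss(saliency, fixations):
    """Mean z-scored saliency at fixated (row, col) pixels.

    Assumes the saliency map is not constant (non-zero standard deviation).
    """
    z = (saliency - saliency.mean()) / saliency.std()
    rows, cols = zip(*fixations)
    return float(z[list(rows), list(cols)].mean())

# Toy frame: only the face map responds, at the frame center.
static = np.zeros((31, 41))
dynamic = np.zeros((31, 41))
face = np.zeros((31, 41))
face[15, 20] = 1.0
master = fuse(static, dynamic, face)
score = nss(master, [(15, 20)])  # positive when fixations land on salient pixels
```

In this sketch a positive NSS means that fixated pixels are more salient than the frame average, and a score near zero means the map predicts gaze no better than chance.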
