Modelling Task-Dependent Eye Guidance to Objects in Pictures

We introduce a model of attentional eye guidance based on the rationale that the deployment of gaze should be considered in the context of a general action-perception loop relying on two strictly intertwined processes. Sensory processing, which depends on the current gaze position, identifies the sources of information that are most valuable under the given task. Motor processing links such information with the oculomotor act by sampling the next gaze position and thus performing the gaze shift. In such a framework, the choice of where to look next is task-dependent and oriented to classes of objects embedded within pictures of complex scenes. The dependence on task is taken into account by exploiting the value and the payoff of gazing at certain image patches, or proto-objects, that provide a sparse representation of the scene objects. The different levels of the action-perception loop are represented in probabilistic form and eventually give rise to a stochastic process that generates the gaze sequence. In this way, the model also accounts for statistical properties of gaze shifts such as individual scan path variability. Results of the simulations are compared with experimental data derived both from publicly available datasets and from our own experiments.
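
The loop described above, in which candidate patches are valued under the task and the next gaze position is then sampled stochastically, can be illustrated with a toy sketch. This is not the model introduced here. The patch centres, the per-patch `values`, the exponential distance penalty and its `scale` parameter are all hypothetical choices made only to show how sampling, rather than deterministic selection, produces scan paths that vary from run to run.

```python
import math
import random


def sample_next_gaze(gaze, patches, values, scale=100.0, rng=random):
    """Sample the index of the next patch to fixate.

    gaze    : (x, y) current gaze position
    patches : list of (x, y) patch centres (stand-ins for proto-objects)
    values  : non-negative task-dependent value of each patch (assumed given)
    scale   : hypothetical length scale, in pixels, penalising long shifts
    """
    weights = []
    for (px, py), value in zip(patches, values):
        dist = math.hypot(px - gaze[0], py - gaze[1])
        # Assumption: weight = task value discounted by distance from gaze.
        weights.append(value * math.exp(-dist / scale))
    if sum(weights) <= 0:
        raise ValueError("at least one patch must have positive value")
    # Stochastic choice: patches are drawn in proportion to their weight.
    return rng.choices(range(len(patches)), weights=weights, k=1)[0]


def simulate_scanpath(start, patches, values, n_fixations, seed=None):
    """Generate a gaze sequence by repeatedly sampling the next fixation."""
    rng = random.Random(seed)
    gaze = start
    path = [gaze]
    for _ in range(n_fixations):
        idx = sample_next_gaze(gaze, patches, values, rng=rng)
        gaze = patches[idx]
        path.append(gaze)
    return path


if __name__ == "__main__":
    patches = [(10, 10), (200, 50), (120, 300)]
    values = [1.0, 0.5, 2.0]  # e.g. the third patch matters most for the task
    print(simulate_scanpath((0, 0), patches, values, n_fixations=5, seed=0))
```

Running the simulation with different seeds gives different gaze sequences over the same picture, a crude analogue of individual scan path variability. The sketch has no memory of visited patches, so the same patch can be fixated repeatedly.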
