Active Fixation Control to Predict Saccade Sequences

Visual attention is a field with a considerable history, and eye movement control and prediction form an important subfield within it. Over the past decades, computational fixation modeling has been dominated by a number of highly influential bottom-up saliency models, such as the Itti-Koch-Niebur model, and the accuracy of such models has increased dramatically with the advent of deep learning. However, on static images these models have largely emphasized non-ordered prediction of fixations through a saliency map, and very few implemented models can generate temporally ordered, human-like sequences of saccades beyond an initial fixation point. Towards addressing these shortcomings, we present STAR-FC, a novel multi-saccade generator based on the integration of central high-level, object-based saliency with peripheral lower-level, feature-based saliency. We have evaluated our model on the CAT2000 database, successfully predicting human patterns of fixation with accuracy and quality equivalent to what can be achieved by using one human sequence to predict another.
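As a rough illustration of the two-pathway architecture described above, the sketch below (Python with NumPy) shows one way such a saccade-sequence generator could be organized. This is not the authors' STAR-FC implementation: the functions central_saliency and peripheral_saliency, and the constants CENTRAL_RADIUS, IOR_RADIUS, and IOR_DECAY, are hypothetical placeholders standing in for the actual high-level and feature-based saliency models and their parameters.

import numpy as np

CENTRAL_RADIUS = 64   # pixels treated as "central" vision (assumed value)
IOR_RADIUS = 32       # inhibition-of-return radius (assumed value)
IOR_DECAY = 0.9       # per-saccade decay of inhibition (assumed value)


def central_saliency(image):
    # Placeholder for a high-level, object-based saliency model.
    return image.mean(axis=-1) / 255.0


def peripheral_saliency(image):
    # Placeholder for a lower-level, feature-based saliency model.
    gray = image.mean(axis=-1)
    return np.abs(gray - gray.mean()) / 255.0


def generate_scanpath(image, n_saccades=10):
    h, w = image.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    fixation = (h // 2, w // 2)          # start at the image centre
    inhibition = np.zeros((h, w))
    scanpath = [fixation]

    for _ in range(n_saccades):
        # Distance of every pixel from the current fixation.
        dist = np.hypot(ys - fixation[0], xs - fixation[1])
        central_mask = dist <= CENTRAL_RADIUS

        # Blend the two pathways: high-level saliency centrally,
        # feature-based saliency in the periphery.
        combined = np.where(central_mask,
                            central_saliency(image),
                            peripheral_saliency(image))

        # Suppress recently visited locations and pick the next target.
        combined = combined * (1.0 - inhibition)
        fixation = np.unravel_index(np.argmax(combined), combined.shape)
        scanpath.append(fixation)

        # Update inhibition-of-return around the new fixation.
        inhibition *= IOR_DECAY
        new_dist = np.hypot(ys - fixation[0], xs - fixation[1])
        inhibition = np.maximum(inhibition,
                                (new_dist <= IOR_RADIUS).astype(float))

    return scanpath

The key point the sketch tries to convey is the division of labour: an object-based map governs target selection near the current fixation, a cheaper feature-based map governs the periphery, and inhibition of return prevents the generator from repeatedly selecting the same location, yielding an ordered sequence of fixations rather than a single static saliency map.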

[1] Zhi Liu et al. Saccadic model of eye movements for free-viewing condition, 2015, Vision Research.

[2] Jean-Michel Loubes et al. Review and Perspective for Distance-Based Clustering of Vehicle Trajectories, 2016, IEEE Transactions on Intelligent Transportation Systems.

[3] José Santos-Victor et al. Vision-based navigation and environmental representations with an omnidirectional camera, 2000, IEEE Trans. Robotics Autom.

[4] A. Watson. A formula for human retinal ganglion cell receptive field density as a function of visual field location, 2014, Journal of Vision.

[5] Iain D. Gilchrist et al. Visual correlates of fixation selection: effects of scale and time, 2005, Vision Research.

[6] Esa Rahtu et al. Stochastic bottom-up fixation prediction and saccade generation, 2013, Image Vis. Comput.

[7] Ali Borji et al. CAT2000: A Large Scale Fixation Dataset for Boosting Saliency Research, 2015, arXiv.

[8] Frédo Durand et al. Learning to predict where humans look, 2009, IEEE 12th International Conference on Computer Vision.

[9] Calden Wloka et al. Spatially Binned ROC: A Comprehensive Saliency Metric, 2016, IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[10] Michele A. Basso et al. Saccadic eye movements and the basal ganglia, 2011.

[11] Nicolas Riche et al. Rare: A new bottom-up saliency model, 2012, 19th IEEE International Conference on Image Processing.

[12] W. Skaggs et al. The Cerebellum, 2016.

[13] Matthias Bethge et al. Deep Gaze I: Boosting Saliency Prediction with Feature Maps Trained on ImageNet, 2014, ICLR.

[14] Eileen Kowler. Eye movements: The past 25 years, 2011, Vision Research.

[15] T. Foulsham et al. Fixation-dependent memory for natural scenes: an experimental test of scanpath theory, 2013, Journal of Experimental Psychology: General.

[16] John K. Tsotsos et al. An Information Theoretic Model of Saliency and Visual Search, 2008, WAPCV.

[17] Qi Zhao et al. SALICON: Reducing the Semantic Gap in Saliency Prediction by Adapting Deep Neural Networks, 2015, IEEE International Conference on Computer Vision (ICCV).

[18] Matthias Bethge et al. Information-theoretic model comparison unifies saliency metrics, 2015, Proceedings of the National Academy of Sciences.

[19] John K. Tsotsos et al. Visual Saliency Improves Autonomous Visual Search, 2014, Canadian Conference on Computer and Robot Vision.

[20] Shang-Hong Lai et al. Fusing generic objectness and visual saliency for salient object detection, 2011, International Conference on Computer Vision.

[21] Theo Geisel et al. The ecology of gaze shifts, 2000, Neurocomputing.

[22] Jillian H. Fecteau et al. Salience, relevance, and firing: a priority map for target selection, 2006, Trends in Cognitive Sciences.

[23] L. Stark et al. Scanpaths in Eye Movements during Pattern Perception, 1971, Science.

[24] S. Ullman et al. Shifts in selective visual attention: towards the underlying neural circuitry, 1985, Human Neurobiology.

[25] Karon E. MacLean et al. Meet Me where I’m Gazing: How Shared Attention Gaze Affects Human-Robot Handover Timing, 2014, 9th ACM/IEEE International Conference on Human-Robot Interaction (HRI).

[26] P. Perona et al. Objects predict fixations better than early saliency, 2008, Journal of Vision.

[27] Ali Borji et al. Objects do not predict fixations better than early saliency: a re-analysis of Einhauser et al.'s data, 2013, Journal of Vision.

[28] I. Rentschler et al. Peripheral vision and pattern recognition: a review, 2011, Journal of Vision.

[29] John K. Tsotsos et al. Cognitive programs: software for attention's executive, 2014, Front. Psychol.

[30] Yizhou Yu et al. Deep Contrast Learning for Salient Object Detection, 2016, IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[31] John K. Tsotsos et al. Attention based on information maximization, 2010.

[32] Albert Ali Salah et al. Joint Attention by Gaze Interpolation and Saliency, 2013, IEEE Transactions on Cybernetics.

[33] Pietro Perona et al. Graph-Based Visual Saliency, 2006, NIPS.

[34] W. Geisler et al. Optimal Eye Movement Strategies in Visual Search (Supplement), 2005.

[35] Christopher Thomas. OpenSalicon: An Open Source Implementation of the Salicon Saliency Model, 2016, arXiv.

[36] Stan Sclaroff et al. Exploiting Surroundedness for Saliency Detection: A Boolean Map Approach, 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[37] Peyman Milanfar et al. Static and space-time visual saliency detection by self-resemblance, 2009, Journal of Vision.

[38] John K. Tsotsos et al. Revisiting active perception, 2016, Autonomous Robots.

[39] John K. Tsotsos. On the relative complexity of active vs. passive visual search, 2004, International Journal of Computer Vision.

[40] Stan Sclaroff et al. Saliency Detection: A Boolean Map Approach, 2013, IEEE International Conference on Computer Vision.

[41] Christof Koch et al. A Model of Saliency-Based Visual Attention for Rapid Scene Analysis, 2009.

[42] T. Shipley et al. Cognitive and psychological science insights to improve climate change data visualization, 2016.

[43] Matthias Bethge et al. DeepGaze II: Reading fixations from deep features trained on object recognition, 2016, arXiv.

[44] J. Douglas Crawford et al. Neural control of three-dimensional gaze shifts, 2011.

[45] B. Tatler et al. Yarbus, eye movements, and vision, 2010, i-Perception.

[46] Simone Frintrop et al. Traditional saliency reloaded: A good old model in new shape, 2015, IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[47] J. Victor et al. Temporal Encoding of Spatial Information during Active Visual Fixation, 2012, Current Biology.

[48] C. Koch et al. A saliency-based search mechanism for overt and covert shifts of visual attention, 2000, Vision Research.

[49] Klaus-Peter Hoffmann et al. The optokinetic reflex, 2011.

[50] Christof Koch et al. Image Signature: Highlighting Sparse Salient Regions, 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[51] Rosalind W. Picard et al. Grounded situation models for situated conversational assistants, 2007.

[52] Leon A. Gatys et al. Understanding Low- and High-Level Contributions to Fixation Prediction, 2017, IEEE International Conference on Computer Vision (ICCV).

[53] Douglas P. Munoz et al. The superior colliculus, 2011.

[54] Thomas Deselaers et al. What is an object?, 2010, IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[55] B. Tatler et al. The prominence of behavioural biases in eye guidance, 2009.

[56] Shu Fang et al. Learning Discriminative Subspaces on Random Contrasts for Image Saliency Analysis, 2017, IEEE Transactions on Neural Networks and Learning Systems.

[57] John K. Tsotsos et al. A Focus on Selection for Fixation, 2016.

[58] Wilson S. Geisler et al. Real-time foveated multiresolution system for low-bandwidth video communication, 1998, Electronic Imaging.

[59] H. Basford et al. Optimal eye movement strategies in visual search, 2005.

[60] Tianming Liu et al. Predicting eye fixations using convolutional neural networks, 2015, IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[61] John K. Tsotsos et al. Joint Attention in Autonomous Driving (JAAD), 2016, arXiv.

[62] R. Venkatesh Babu et al. DeepFix: A Fully Convolutional Neural Network for Predicting Human Eye Fixations, 2015, IEEE Transactions on Image Processing.

[63] Antón García-Díaz et al. Saliency from hierarchical adaptation through decorrelation and variance normalization, 2012, Image Vis. Comput.