Saccade Sequence Prediction: Beyond Static Saliency Maps

Visual attention has a long research history, with eye movement control and prediction forming an important subfield. For the past few decades, computational fixation modeling has been dominated by a handful of highly influential bottom-up saliency models, such as the Itti-Koch-Niebur model, and the accuracy of such models has increased dramatically with the advent of deep learning. On static images, however, these models emphasize unordered prediction of fixations through a single saliency map; very few implemented models can generate temporally ordered, human-like sequences of saccades beyond an initial fixation point. To address these shortcomings we present STAR-FC, a novel multi-saccade generator based on a central/peripheral integration of deep-learning-based saliency and lower-level feature-based saliency. Evaluated on the CAT2000 database, STAR-FC predicts human fixation patterns with accuracy and quality equivalent to what is achieved by using one human's fixation sequence to predict another's, a significant improvement over fixation sequences predicted by state-of-the-art saliency algorithms.
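To make the central/peripheral idea concrete, the sketch below shows one plausible way such a generator could be structured. This is not the STAR-FC implementation; it is a minimal illustration assuming two precomputed saliency maps (a deep-learning map used within a hypothetical central field, a low-level feature map used in the periphery), a hypothetical `radius` parameter for the central field, and Gaussian inhibition of return to keep the sequence from revisiting the same peak.

```python
import numpy as np

def next_fixation(central_sal, peripheral_sal, fixation, radius=64):
    """One step of a hypothetical central/peripheral saccade generator.

    central_sal, peripheral_sal: 2-D saliency maps of the same shape,
    e.g. from a deep model and a low-level feature model respectively.
    fixation: (row, col) of the current fixation.
    radius: radius in pixels of the "central" field (an assumed parameter).
    """
    h, w = central_sal.shape
    rows, cols = np.ogrid[:h, :w]
    dist = np.sqrt((rows - fixation[0]) ** 2 + (cols - fixation[1]) ** 2)
    # Deep saliency drives the central field; feature saliency drives the periphery.
    blended = np.where(dist <= radius, central_sal, peripheral_sal)
    # The next saccade target is the peak of the blended map.
    return np.unravel_index(np.argmax(blended), blended.shape)

def saccade_sequence(central_sal, peripheral_sal, start, n_saccades=5, ior_sigma=32.0):
    """Generate an ordered fixation sequence with Gaussian inhibition of return."""
    central = central_sal.astype(float).copy()
    peripheral = peripheral_sal.astype(float).copy()
    h, w = central.shape
    rows, cols = np.ogrid[:h, :w]
    fixations = [start]
    for _ in range(n_saccades):
        fix = fixations[-1]
        # Suppress the just-fixated region in both maps so the gaze moves on.
        ior = np.exp(-((rows - fix[0]) ** 2 + (cols - fix[1]) ** 2)
                     / (2 * ior_sigma ** 2))
        central *= 1 - ior
        peripheral *= 1 - ior
        fixations.append(next_fixation(central, peripheral, fix))
    return fixations
```

The key property this sketch shares with the model described above is that the output is an ordered sequence of fixations rather than a static saliency map, with the map consulted near the current fixation differing from the one consulted in the periphery.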