The Assessment of Sencogi: A Visual Complexity Model Predicting Visual Fixations

This paper investigates whether a frequency-domain model of visual complexity can accurately predict human saliency maps. The Sencogi model operates in the frequency domain to compute maps of spatial (i.e., static) and temporal (i.e., dynamic) complexity. This study compares the complexity maps generated by Sencogi with human fixation maps obtained during a visual quality assessment task on static images. The work is the first part of an ongoing multi-step study designed to assess whether fixation maps are an accurate representation of saliency for spatio-temporal scenes. A supporting experiment confirmed that top-down factors, such as scene type, task, or emotional state, did not affect the human fixation maps. Results show that the Sencogi visual complexity model predicts human eye fixations on images with scores significantly above a Chance baseline and competitive with a Single Observer baseline. We conclude that the Sencogi visual complexity model is able to predict human fixations in the spatial domain. Subsequent studies will assess Sencogi's performance in predicting visual fixations in the spatio-temporal domain.
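The paper does not reproduce Sencogi's internal equations, but the evaluation pipeline it describes, a frequency-domain complexity map scored against binary fixation maps relative to a Chance baseline, can be sketched in outline. The code below is a minimal illustration under stated assumptions, not the authors' implementation: the block-FFT energy measure, the `spatial_complexity_map` and `nss` helper names, the choice of Normalized Scanpath Saliency as the score, and the synthetic image and fixation data are all hypothetical.

```python
# Sketch (NOT the Sencogi model): a generic frequency-domain complexity
# map scored against a binary fixation map with NSS, versus chance.
import numpy as np

def spatial_complexity_map(image, block=16):
    """Hypothetical per-block complexity: non-DC spectral energy of the
    2-D FFT of each block, upsampled and normalised to [0, 1]."""
    h, w = image.shape
    comp = np.zeros((h // block, w // block))
    for i in range(h // block):
        for j in range(w // block):
            patch = image[i*block:(i+1)*block, j*block:(j+1)*block]
            spec = np.abs(np.fft.fft2(patch))
            spec[0, 0] = 0.0          # discard the DC term (mean luminance)
            comp[i, j] = spec.sum()   # residual spectral energy = "complexity"
    comp = np.kron(comp, np.ones((block, block)))  # upsample to image size
    return (comp - comp.min()) / (comp.max() - comp.min() + 1e-12)

def nss(saliency, fixations):
    """Normalized Scanpath Saliency: mean z-scored saliency at fixated pixels."""
    z = (saliency - saliency.mean()) / (saliency.std() + 1e-12)
    return z[fixations.astype(bool)].mean()

# Usage with stand-in data: a random saliency map scores NSS near 0,
# which is the Chance baseline the model must exceed.
rng = np.random.default_rng(0)
img = rng.random((256, 256))             # stand-in grayscale image
fix = rng.random((256, 256)) > 0.999     # stand-in binary fixation map
print("model NSS :", nss(spatial_complexity_map(img), fix))
print("chance NSS:", nss(rng.random((256, 256)), fix))
```

A Single Observer baseline, by contrast, would score one observer's fixation map as a saliency map against the remaining observers' fixations, giving an upper-bound-style reference for inter-observer agreement.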
