Predicting human complexity perception of real-world scenes

Perceptual load is a well-established determinant of attentional engagement in a task. So far, perceptual load has typically been manipulated by increasing either the number of task-relevant items or the perceptual processing demand (e.g. conjunction versus feature tasks), and the tasks used have often involved rather simple visual displays (e.g. letters or single objects). How can perceptual load be operationalized for richer, real-world images? A promising proxy is the visual complexity of an image. However, current predictive models of visual complexity have limited applicability to diverse real-world images. Here we modelled visual complexity using a deep convolutional neural network (CNN) trained to predict human ratings of perceived visual complexity. We presented 53 observers with 4000 images from the PASCAL VOC dataset, obtaining 75 020 two-alternative forced-choice paired comparisons across observers. A visual complexity score for each image was then derived from these comparisons using the TrueSkill algorithm. A CNN with weights pre-trained on an object recognition task predicted the complexity ratings with a correlation of r = 0.83. By contrast, feature-based models used in the literature, which operate on image statistics such as entropy, edge density and JPEG compression ratio, achieved only r = 0.70. Thus, our model offers a promising method to quantify the perceptual load of real-world scenes through visual complexity.
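
To make the feature-based baselines mentioned above concrete, the following is a minimal sketch of how image statistics of this kind might be computed. It is not the implementation used in the study: the function names, the gradient threshold of 32 and the 2-D-list image representation are illustrative assumptions, and zlib compression is used as a plainly labelled stand-in for the JPEG compression ratio so that the example needs only the Python standard library.

```python
import math
import zlib
from collections import Counter


def shannon_entropy(image):
    """Entropy in bits of the grey-level histogram.

    `image` is assumed to be a 2-D list of integers in 0..255.
    """
    pixels = [p for row in image for p in row]
    n = len(pixels)
    counts = Counter(pixels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def edge_density(image, threshold=32):
    """Fraction of pixels whose forward-difference gradient magnitude
    exceeds `threshold` (a hypothetical value, not taken from the paper).
    The last row and column are skipped because they lack a neighbour.
    """
    height, width = len(image), len(image[0])
    edges = 0
    total = 0
    for y in range(height - 1):
        for x in range(width - 1):
            gx = image[y][x + 1] - image[y][x]
            gy = image[y + 1][x] - image[y][x]
            if math.hypot(gx, gy) > threshold:
                edges += 1
            total += 1
    return edges / total


def compression_ratio(image):
    """Compressed size divided by raw size.

    Uses lossless zlib as a stand-in for JPEG; higher values mean the
    image is harder to compress.
    """
    raw = bytes(p for row in image for p in row)
    return len(zlib.compress(raw, 9)) / len(raw)
```

Under these assumptions, a uniform image yields zero entropy and zero edge density, whereas a pixel-level checkerboard yields one bit of entropy and an edge density of 1. Statistics like these would then be regressed against the human complexity scores to form a baseline predictor.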
