Visual complexity analysis using deep intermediate-layer features

Abstract

In this paper, we focus on visual complexity, an image attribute that humans can subjectively evaluate based on the level of detail in an image. We explore unsupervised information extraction from intermediate convolutional layers of deep neural networks to measure visual complexity. We derive an activation energy metric that combines convolutional layer activations to quantify visual complexity. To show the effectiveness of our proposed metric for various applications, we introduce Savoias, a visual complexity dataset that comprises more than 1,400 images from seven diverse image categories (e.g., advertisement and interior design). We demonstrate high correlations of our deep neural network-based measure of visual complexity with human-curated ground-truth (GT) scores on several widely used network architectures, e.g., VGG16, ResNet-v2-152, and EfficientNet, and on networks trained on two classification tasks (object and scene classification). This result reveals that intermediate convolutional layers of deep neural networks carry information about the complexity of images that is meaningful to people. Furthermore, we show that our method of measuring visual complexity outperforms traditional methods on Savoias and two other state-of-the-art benchmark datasets. Moreover, we analyze the performance difference between our unsupervised method and a supervised method trained on the feature maps, and show that supervision further improves the predictions. Finally, we demonstrate that, within the context of a category, visually more complex images are also more memorable to human observers.
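The abstract does not spell out the exact form of the activation energy metric, so the sketch below only illustrates the general idea: pass an image through a pretrained network (here VGG16 from torchvision, one of the architectures mentioned), hook its intermediate convolutional layers, and aggregate the activation maps into a single scalar. The specific aggregation (mean squared activation per layer, averaged over layers) and the layer selection are assumptions for illustration, not the paper's exact formulation; note that no fine-tuning is involved, matching the unsupervised setting described above.

```python
# Minimal sketch of an activation-energy-style complexity score.
# ASSUMPTION: the aggregation (mean squared activation per conv layer,
# averaged over layers) is illustrative; the paper's exact metric may differ.
import torch
import torch.nn as nn
from torchvision import models, transforms
from PIL import Image


def activation_energy(image_path: str) -> float:
    """Return a scalar 'activation energy' for one image."""
    # Pretrained VGG16 (torchvision >= 0.13 weights API), inference mode only.
    model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).eval()

    energies = []  # one value per convolutional layer

    def hook(_module, _inp, out):
        # Mean squared activation of the feature map (its "energy").
        energies.append(out.pow(2).mean().item())

    # Hook every intermediate convolutional layer of the feature extractor.
    handles = [m.register_forward_hook(hook)
               for m in model.features if isinstance(m, nn.Conv2d)]

    preprocess = transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])
    x = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)

    with torch.no_grad():
        model(x)

    for h in handles:
        h.remove()

    # Combine per-layer energies into a single complexity score.
    return sum(energies) / len(energies)


# Usage (hypothetical path): higher scores suggest visually busier images.
# print(activation_energy("example.jpg"))
```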
