Fixation prediction for advertising images: Dataset and benchmark

Abstract Existing saliency prediction methods focus on universal saliency models for natural images; relatively few address advertising images, which typically consist of both textual and pictorial regions. To fill this gap, we first build an advertising image database, named ADD1000, which records the eye movement data of 57 subjects on 1000 ad images. Compared to natural images, advertising images contain more artificial scenarios and show stronger persuasiveness and deliberateness, yet the impact of this scene heterogeneity on visual attention is rarely studied. Moreover, text elements and picture elements in ad images express closely related semantic information to highlight a product or brand, yet their respective contributions to visual attention are also poorly understood. Motivated by these observations, we further propose a saliency prediction model for advertising images based on text enhanced learning (TEL-SP), which comprehensively considers the interplay between textual and pictorial regions. Extensive experiments on the ADD1000 database show that the proposed model outperforms existing state-of-the-art methods.
