A preliminary benchmark of four saliency algorithms on comic art

Predicting the salient regions of a comic book panel could drive a variety of applications, such as segmentation, cropping, and effects such as moves on stills. Computational saliency algorithms have been extensively tested and benchmarked on natural images. We report the performance of four saliency algorithms on a set of comic panels taken from public domain legacy comics. We find that a data-driven method performs best on two metrics, Normalized Scanpath Saliency and Area Under the Curve. We discuss possible reasons for this finding based on an exploratory analysis of the similarity between the comic images in our dataset and the images in the dataset of the data-driven method.
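To make the two metrics concrete, the following is a minimal sketch of how Normalized Scanpath Saliency (NSS) and Area Under the Curve (AUC) are commonly computed from a predicted saliency map and a binary fixation map. The function names `nss` and `auc`, the array layout, and the choice of AUC variant (all non-fixated pixels treated as negatives, ties counted as half) are assumptions made for illustration; the abstract does not state which variant was used.

```python
import numpy as np

def nss(saliency, fixations):
    """Mean of the z-scored saliency map at fixated pixels (hypothetical helper)."""
    s = np.asarray(saliency, dtype=float)
    fix = np.asarray(fixations, dtype=bool)
    std = s.std()
    if std == 0 or not fix.any():
        return float("nan")  # undefined for a constant map or no fixations
    z = (s - s.mean()) / std
    return float(z[fix].mean())

def auc(saliency, fixations):
    """Area under the ROC curve (hypothetical helper).

    Assumption: fixated pixels are positives and every other pixel is a
    negative. The area equals the probability that a random positive
    scores higher than a random negative, with ties counted as one half.
    """
    s = np.asarray(saliency, dtype=float).ravel()
    fix = np.asarray(fixations, dtype=bool).ravel()
    pos = s[fix]
    neg = np.sort(s[~fix])
    if pos.size == 0 or neg.size == 0:
        return float("nan")
    below = np.searchsorted(neg, pos, side="left")   # negatives strictly lower
    upto = np.searchsorted(neg, pos, side="right")   # negatives lower or equal
    ties = upto - below
    return float((below + 0.5 * ties).sum() / (pos.size * neg.size))

# Toy example: one salient pixel that coincides with the only fixation.
sal = np.zeros((4, 4))
sal[1, 2] = 1.0
fix = np.zeros((4, 4), dtype=bool)
fix[1, 2] = True
print(nss(sal, fix), auc(sal, fix))
```

Under these assumptions, an NSS of 0 and an AUC of 0.5 both correspond to chance-level prediction, and higher values mean the map places more saliency at fixated locations.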
