Image Memorability Using Diverse Visual Features and Soft Attention

In this paper we present a method for estimating the memorability of still images. The proposed solution exploits feature maps extracted from two Convolutional Neural Networks, pre-trained for object recognition and memorability estimation respectively. These feature maps are then weighted by a soft attention mechanism so that the model focuses on the image regions that are most informative for memorability. Results achieved on a benchmark dataset demonstrate the effectiveness of the proposed method.
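The abstract does not detail the architecture, but the core idea of soft spatial attention over pre-trained CNN feature maps can be illustrated with a minimal PyTorch sketch. The module below is an assumption for illustration only (class name, feature shapes, and the sigmoid regressor are hypothetical, not the authors' implementation): it scores each spatial location of a feature map, normalizes the scores with a softmax, and regresses a memorability value from the attention-weighted descriptor.

```python
# Minimal sketch (not the authors' code): soft spatial attention over CNN
# feature maps, assuming ResNet-style features of shape (batch, channels, H, W).
import torch
import torch.nn as nn
import torch.nn.functional as F


class SoftAttentionRegressor(nn.Module):
    """Weights each spatial location of a feature map and regresses a
    memorability score from the attended feature vector."""

    def __init__(self, channels: int = 2048):
        super().__init__()
        self.attn = nn.Conv2d(channels, 1, kernel_size=1)  # one score per location
        self.fc = nn.Linear(channels, 1)                    # memorability regressor

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        b, c, h, w = feats.shape
        scores = self.attn(feats).view(b, -1)               # (b, h*w) location scores
        alpha = F.softmax(scores, dim=1).view(b, 1, h, w)   # attention weights sum to 1
        attended = (alpha * feats).sum(dim=(2, 3))          # (b, c) attended descriptor
        return torch.sigmoid(self.fc(attended))             # score in [0, 1]


# Example with hypothetical feature maps from a pre-trained backbone.
feats = torch.randn(4, 2048, 7, 7)
print(SoftAttentionRegressor(2048)(feats).shape)            # torch.Size([4, 1])
```

Feature maps from the two pre-trained networks could be fed to such a module either separately or after concatenation along the channel dimension; the abstract does not specify which strategy is used.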
