Understanding the Intrinsic Memorability of Images

Artists, advertisers, and photographers routinely face the task of creating an image that a viewer will remember. While image memorability may seem purely subjective, recent work shows that it is not an inexplicable phenomenon: variation in the memorability of images is consistent across subjects, suggesting that some images are intrinsically more memorable than others, independent of a subject's context and biases. In this paper, we use the publicly available memorability dataset of Isola et al. [13] and augment its object and scene annotations with interpretable spatial, content, and aesthetic image properties. We use a feature-selection scheme with desirable explaining-away properties to determine a compact set of attributes that characterizes the memorability of any individual image. We find that images of enclosed spaces containing people with visible faces are memorable, while images of vistas and peaceful scenes are not. Contrary to popular belief, unusual or aesthetically pleasing scenes do not tend to be highly memorable. This work represents one of the first attempts at understanding intrinsic image memorability and opens a new domain of investigation at the interface between human cognition and computer vision.

[1] I. Rock, et al. A study of memory for visual form, 1959.

[2] L. Standing. Learning 10000 pictures, 1973.

[3] L. Standing. Learning 10,000 pictures, 1973, The Quarterly journal of experimental psychology.

[4] Leo Breiman, et al. Classification and Regression Trees, 1984.

[5] James L. McClelland, et al. Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory, 1995, Psychological review.

[6] R. Shiffrin, et al. A model for recognition memory: REM—retrieving effectively from memory, 1997, Psychonomic bulletin & review.

[7] Christiane Fellbaum, et al. Book Reviews: WordNet: An Electronic Lexical Database, 1999, CL.

[8] Erik Reinhard, et al. Artistic Composition for Image Creation, 2001, Rendering Techniques.

[9] Marc W. Howard, et al. A distributed representation of temporal context, 2002.

[10] Michel Vidal-Naquet, et al. Visual features of intermediate complexity and their use in classification, 2002, Nature Neuroscience.

[11] Bernhard Schölkopf, et al. A tutorial on support vector regression, 2004, Stat. Comput.

[12] Antonio Torralba, et al. Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope, 2001, International Journal of Computer Vision.

[13] Andreas Krause, et al. Near-optimal Nonmyopic Value of Information in Graphical Models, 2005, UAI.

[14] O. Sorkine, et al. Color harmonization, 2006, SIGGRAPH 2006.

[15] J. Worthen, et al. Distinctiveness and memory, 2006.

[16] Gordon D. A. Brown, et al. A temporal ratio model of memory, 2007, Psychological review.

[17] Antonio Torralba, et al. LabelMe: A Database and Web-Based Tool for Image Annotation, 2008, International Journal of Computer Vision.

[18] Andreas Krause, et al. Cost-effective outbreak detection in networks, 2007, KDD '07.

[19] Pietro Perona, et al. Some Objects Are More Equal Than Others: Measuring and Predicting Importance, 2008, ECCV.

[20] Xiaoou Tang, et al. Photo and Video Quality Evaluation: Focusing on the Subject, 2008, ECCV.

[21] Aude Oliva, et al. Visual long-term memory has a massive storage capacity for object details, 2008, Proceedings of the National Academy of Sciences.

[22] Dani Lischinski, et al. Data-driven enhancement of facial attractiveness, 2008, ACM Trans. Graph.

[23] Christoph H. Lampert, et al. Learning to detect unseen object classes by between-class attribute transfer, 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[24] Christof Koch, et al. A Model of Saliency-Based Visual Attention for Rapid Scene Analysis, 2009.

[25] Ali Farhadi, et al. Describing objects by their attributes, 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[26] Timothy F. Brady, et al. Conceptual Distinctiveness Supports Detailed Visual Long-term Memory for Real-world Objects, 2010.

[27] Timothy F. Brady, et al. Scene Memory Is More Detailed Than You Think: The Role of Categories in Visual Long-Term Memory, 2010.

[28] Krista A. Ehinger, et al. SUN database: Large-scale scene recognition from abbey to zoo, 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[29] Daniel Cohen-Or, et al. Optimizing Photo Composition, 2010, Comput. Graph. Forum.

[30] Chih-Jen Lin, et al. LIBSVM: A library for support vector machines, 2011, TIST.

[31] Vicente Ordonez, et al. High level describable attributes for predicting aesthetics and interestingness, 2011, CVPR 2011.

[32] Jianxiong Xiao, et al. What makes an image memorable, 2011.

[33] Abhimanyu Das, et al. Submodular meets Spectral: Greedy Algorithms for Subset Selection, Sparse Approximation and Dictionary Selection, 2011, ICML.