Tiny images

The human visual system is remarkably tolerant to degradations in image resolution: in a scene recognition task, human performance is similar whether 32 × 32 color images or multi-mega pixel images are used. With small images, even object recognition and segmentation is performed robustly by the visual system, despite the object being unrecognizable in isolation. Motivated by these observations, we explore the space of 32 × 32 images using a database of 10 32× 32 color images gathered from the Internet using image search engines. Each image is loosely labeled with one of the 70, 399 non-abstract nouns in English, as listed in the Wordnet lexical database. Hence the image database represents a dense sampling of all object categories and scenes. With this dataset, we use nearest neighbor methods to perform object recognition across the 10 images.

[1]  B Julesz,et al.  Masking in Visual Recognition: Effects of Two-Dimensional Filtered Noise , 1973, Science.

[2]  M. Potter Short-term conceptual memory for pictures. , 1976, Journal of experimental psychology. Human learning and memory.

[3]  T. Bachmann Identification of spatially quantised tachistoscopic images of faces: How many pixels does it take to carry identity? , 1991 .

[4]  J. Wolfe Visual memory: What do you know about what you saw? , 1998, Current Biology.

[5]  David G. Lowe,et al.  Object recognition from local scale-invariant features , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[6]  A. Oliva,et al.  Diagnostic Colors Mediate Scene Recognition , 2000, Cognitive Psychology.

[7]  Trevor Darrell,et al.  Fast pose estimation with parameter-sensitive hashing , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[8]  David A. Forsyth,et al.  Matching Words and Pictures , 2003, J. Mach. Learn. Res..

[9]  Jitendra Malik,et al.  Recognizing action at a distance , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[10]  Yee Whye Teh,et al.  Names and faces in the news , 2004, CVPR 2004.

[11]  Antonio Torralba,et al.  Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope , 2001, International Journal of Computer Vision.

[12]  Pietro Perona,et al.  A Visual Category Filter for Google Images , 2004, ECCV.

[13]  Jitendra Malik,et al.  Shape matching and object recognition using low distortion correspondences , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[14]  Pietro Perona,et al.  A Bayesian hierarchical model for learning natural scene categories , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[15]  Pietro Perona,et al.  Learning object categories from Google's image search , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[16]  Abel G. Oliva,et al.  Gist of a scene , 2005 .

[17]  Cordelia Schmid,et al.  The 2005 PASCAL Visual Object Classes Challenge , 2005, MLCW.

[18]  Cordelia Schmid,et al.  Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[19]  Anthony Hoogs,et al.  Object Boundary Detection in Images using a Semantic Ontology , 2006, 2006 Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'06).

[20]  David Nistér,et al.  Scalable Recognition with a Vocabulary Tree , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[21]  David A. Forsyth,et al.  Animals on the Web , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[22]  G. Griffin,et al.  Caltech-256 Object Category Dataset , 2007 .

[23]  D. Field,et al.  Estimates of the information content and dimensionality of natural scenes from proximity distributions. , 2007, Journal of the Optical Society of America. A, Optics, image science, and vision.

[24]  Trevor Darrell,et al.  Pyramid Match Hashing: Sub-Linear Time Indexing Over Partial Correspondences , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.