Tiny images

The human visual system is remarkably tolerant to degradations in image resolution: in a scene recognition task, human performance is similar whether 32 × 32 color images or multi-mega pixel images are used. With small images, even object recognition and segmentation is performed robustly by the visual system, despite the object being unrecognizable in isolation. Motivated by these observations, we explore the space of 32 × 32 images using a database of 10 32× 32 color images gathered from the Internet using image search engines. Each image is loosely labeled with one of the 70, 399 non-abstract nouns in English, as listed in the Wordnet lexical database. Hence the image database represents a dense sampling of all object categories and scenes. With this dataset, we use nearest neighbor methods to perform object recognition across the 10 images.

[1]  Jitendra Malik,et al.  Shape matching and object recognition using low distortion correspondences , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[2]  Trevor Darrell,et al.  Fast pose estimation with parameter-sensitive hashing , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[3]  Yee Whye Teh,et al.  Names and faces in the news , 2004, CVPR 2004.

[4]  G. Griffin,et al.  Caltech-256 Object Category Dataset , 2007 .

[5]  D. Field,et al.  Estimates of the information content and dimensionality of natural scenes from proximity distributions. , 2007, Journal of the Optical Society of America. A, Optics, image science, and vision.

[6]  David A. Forsyth,et al.  Matching Words and Pictures , 2003, J. Mach. Learn. Res..

[7]  Cordelia Schmid,et al.  Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[8]  Pietro Perona,et al.  A Bayesian hierarchical model for learning natural scene categories , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[9]  Pietro Perona,et al.  Learning object categories from Google's image search , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[10]  Jitendra Malik,et al.  Recognizing action at a distance , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[11]  Abel G. Oliva,et al.  Gist of a scene , 2005 .

[12]  Anthony Hoogs,et al.  Object Boundary Detection in Images using a Semantic Ontology , 2006, 2006 Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'06).

[13]  Antonio Torralba,et al.  Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope , 2001, International Journal of Computer Vision.

[14]  Pietro Perona,et al.  A Visual Category Filter for Google Images , 2004, ECCV.

[15]  B Julesz,et al.  Masking in Visual Recognition: Effects of Two-Dimensional Filtered Noise , 1973, Science.

[16]  Trevor Darrell,et al.  Pyramid Match Hashing: Sub-Linear Time Indexing Over Partial Correspondences , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[17]  David Nistér,et al.  Scalable Recognition with a Vocabulary Tree , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[18]  T. Bachmann Identification of spatially quantised tachistoscopic images of faces: How many pixels does it take to carry identity? , 1991 .

[19]  David A. Forsyth,et al.  Animals on the Web , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[20]  J. Wolfe Visual memory: What do you know about what you saw? , 1998, Current Biology.

[21]  Cordelia Schmid,et al.  The 2005 PASCAL Visual Object Classes Challenge , 2005, MLCW.

[22]  A. Oliva,et al.  Diagnostic Colors Mediate Scene Recognition , 2000, Cognitive Psychology.

[23]  M. Potter Short-term conceptual memory for pictures. , 1976, Journal of experimental psychology. Human learning and memory.

[24]  David G. Lowe,et al.  Object recognition from local scale-invariant features , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.