Categorization of natural scenes: Local versus global information and the role of color

Categorization of scenes is a fundamental process of human vision that allows us to analyze our surroundings rapidly and efficiently. Several studies have explored the processes underlying human scene categorization, but they have focused on the processing of global image information. In this study, we present both psychophysical and computational experiments that investigate the role of local versus global image information in scene categorization. In a first set of human experiments, categorization performance is tested when only local or only global image information is present. Our results suggest that humans rely on local, region-based information as much as on global, configural information. In addition, humans seem to integrate both types of information when categorizing intact scenes. In a set of computational experiments, human performance is compared to two state-of-the-art computer vision approaches that have been shown to be psychophysically plausible and that model either local or global information. In a second series of experiments, we investigated the effect of color on the categorization performance of both the human observers and the computational model. Analysis of the human data suggests that color is an additional channel of perceptual information that leads to higher categorization performance at the expense of increased reaction times in the intact condition. However, color does not affect reaction times when only local information is present. When color is removed, the employed computational model follows the relative performance decrease of human observers for each scene category and can thus be seen as a perceptually plausible model of human scene categorization based on local image information.
