Representation Models and Machine Learning Techniques for Scene Classification

Scene classification is a fundamental process of human vision that allows us to analyze our surroundings rapidly and efficiently. Humans can recognize complex visual scenes at a single glance, despite the number of objects with different poses, colors, shadows, and textures that the scenes may contain. Understanding the robustness and speed of this ability has been a focus of investigation in the cognitive sciences for many years. These studies have stimulated research in computer vision on building artificial scene recognition systems. Beyond pure scientific curiosity, motivation comes from several important computer vision applications in which scene classification can be exploited (e.g., robot navigation systems). Different methods have been proposed to model and describe the content of a scene, and different machine learning procedures have been employed to automatically learn the commonalities and differences between scene classes. In this chapter we survey some of the state-of-the-art approaches to scene classification. For each approach we provide a description and a discussion of its most relevant characteristics.
