Which is the best way to organize/classify images by content?

Thousands of images are generated every day, creating a need to classify, organise and access them in an easy, fast and efficient way. Scene classification, the classification of images into semantic categories (e.g. coast, mountains and streets), is a challenging and important problem. Many different approaches to scene classification have been proposed in the last few years. This article presents a detailed review of some of the most commonly used scene classification approaches. Furthermore, the surveyed techniques have been tested and their accuracy evaluated. Comparative results are presented and discussed, highlighting the advantages and disadvantages of each methodology.
