A Bayesian hierarchical model for learning natural scene categories

We propose a novel approach to learn and recognize natural scene categories. Unlike previous work, it does not require experts to annotate the training set. We represent the image of a scene by a collection of local regions, denoted as codewords obtained by unsupervised learning. Each region is represented as part of a "theme". In previous work, such themes were learnt from hand-annotations of experts, while our method learns the theme distributions as well as the codewords distribution over the themes without supervision. We report satisfactory categorization performances on a large set of 13 categories of complex scenes.

[1]  Thomas S. Huang,et al.  Image processing , 1971 .

[2]  A. Treisman,et al.  A feature-integration theory of attention , 1980, Cognitive Psychology.

[3]  Rosalind W. Picard,et al.  Texture orientation for sorting photos "at a glance" , 1994, Proceedings of 12th International Conference on Pattern Recognition.

[4]  David B. Dunson,et al.  Bayesian Data Analysis , 2010 .

[5]  S. Thorpe,et al.  Speed of processing in the human visual system , 1996, Nature.

[6]  Martin Szummer,et al.  Indoor-outdoor image classification , 1998, Proceedings 1998 IEEE International Workshop on Content-Based Access of Image and Video Database.

[7]  David G. Lowe,et al.  Object recognition from local scale-invariant features , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[8]  Anil K. Jain,et al.  Image classification for content-based indexing , 2001, IEEE Trans. Image Process..

[9]  Tom Minka,et al.  Expectation-Propogation for the Generative Aspect Model , 2002, UAI.

[10]  P. Perona,et al.  Rapid natural scene categorization in the near absence of attention , 2002, Proceedings of the National Academy of Sciences of the United States of America.

[11]  Andrew Zisserman,et al.  Texture classification: are filter banks necessary? , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[12]  Michael I. Jordan,et al.  Latent Dirichlet Allocation , 2001, J. Mach. Learn. Res..

[13]  John M. Winn,et al.  Variational Message Passing and its Applications , 2004 .

[14]  Bernt Schiele,et al.  A Semantic Typicality Measure for Natural Scene Categorization , 2004, DAGM-Symposium.

[15]  Michael Brady,et al.  Saliency, Scale and Image Description , 2001, International Journal of Computer Vision.

[16]  Eero P. Simoncelli,et al.  A Parametric Texture Model Based on Joint Statistics of Complex Wavelet Coefficients , 2000, International Journal of Computer Vision.

[17]  Antonio Torralba,et al.  Sharing features: efficient boosting procedures for multiclass object detection , 2004, CVPR 2004.

[18]  Antonio Torralba,et al.  Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope , 2001, International Journal of Computer Vision.

[19]  Jitendra Malik,et al.  Representing and Recognizing the Visual Appearance of Materials using Three-dimensional Textons , 2001, International Journal of Computer Vision.

[20]  Jerry Nedelman,et al.  Book review: “Bayesian Data Analysis,” Second Edition by A. Gelman, J.B. Carlin, H.S. Stern, and D.B. Rubin Chapman & Hall/CRC, 2004 , 2005, Comput. Stat..