Scene Modeling Using Co-Clustering

In this paper, we propose a novel approach to scene modeling that automatically discovers intermediate semantic concepts. We use the Maximization of Mutual Information (MMI) co-clustering approach to discover clusters of semantic concepts, which we call intermediate concepts. Each intermediate concept corresponds to a cluster of visterms in the bag of visterms (BOV) paradigm for scene classification. MMI co-clustering yields fewer but meaningful clusters. Unlike k-means, which is used in BOV to cluster image patches by appearance, MMI co-clustering can group visterms that are highly correlated with the same concept. Unlike probabilistic latent semantic analysis (pLSA), which can be regarded as one-sided soft clustering, MMI co-clustering clusters visterms and images simultaneously, so that each clustering can improve the other. In addition, MMI co-clustering is unsupervised. We have extensively tested the proposed approach on two challenging datasets, the fifteen scene categories and the LSCOM dataset, and obtained promising results.
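
To make the MMI criterion concrete, the following minimal sketch computes the mutual information between visterms and images before and after both are merged into clusters. It illustrates only the objective that a co-clustering of this kind tries to preserve; it is not the optimization procedure used in the paper. The toy count table and the names `mutual_information`, `aggregate`, `good`, and `bad` are assumptions introduced for illustration.

```python
import math

def mutual_information(counts):
    """Mutual information (in bits) between rows and columns of a count table."""
    total = sum(sum(row) for row in counts)
    row_sums = [sum(row) for row in counts]
    col_sums = [sum(col) for col in zip(*counts)]
    mi = 0.0
    for i, row in enumerate(counts):
        for j, c in enumerate(row):
            if c > 0:
                # p(i,j) * log2( p(i,j) / (p(i) * p(j)) )
                mi += (c / total) * math.log2(c * total / (row_sums[i] * col_sums[j]))
    return mi

def aggregate(counts, row_labels, col_labels):
    """Sum the counts of all rows and columns that share a cluster label."""
    n_rows = max(row_labels) + 1
    n_cols = max(col_labels) + 1
    merged = [[0] * n_cols for _ in range(n_rows)]
    for i, row in enumerate(counts):
        for j, c in enumerate(row):
            merged[row_labels[i]][col_labels[j]] += c
    return merged

# Hypothetical toy data: rows are visterms, columns are images.
# Visterms 0-1 occur only in images 0-1, visterms 2-3 only in images 2-3.
counts = [
    [5, 5, 0, 0],
    [4, 6, 0, 0],
    [0, 0, 5, 5],
    [0, 0, 6, 4],
]

# A co-clustering that follows the block structure, and one that ignores it.
good = aggregate(counts, [0, 0, 1, 1], [0, 0, 1, 1])
bad = aggregate(counts, [0, 1, 0, 1], [0, 1, 0, 1])

mi_full = mutual_information(counts)
mi_good = mutual_information(good)
mi_bad = mutual_information(bad)

print(f"full table: {mi_full:.3f} bits")
print(f"good co-clustering: {mi_good:.3f} bits")
print(f"bad co-clustering: {mi_bad:.3f} bits")
```

On this toy table, the co-clustering that follows the block structure keeps almost all of the mutual information of the full table, while the one that mixes the blocks keeps none. An MMI co-clustering searches over row and column assignments for the pair that loses the least mutual information.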
