Paper information - On the use of supervised features for unsupervised image categorization: An evaluation

Abstract: Recently, new high-level features have been proposed to describe the semantic content of images. These features, which we call supervised, are obtained by exploiting the information provided by an additional set of labeled images. Supervised features have been used successfully in image classification and retrieval, where they showed excellent results. In this paper, we demonstrate that they can also be used effectively for unsupervised image categorization, that is, for grouping semantically similar images. We have experimented with different state-of-the-art clustering algorithms on various standard data sets commonly used for supervised image classification evaluations. We have compared the results obtained by using four supervised features (namely, classemes, prosemantic features, object bank, and a feature obtained from a Canonical Correlation Analysis) against those obtained by using low-level features. The results show that supervised features exhibit a remarkable expressiveness, which makes it possible to effectively group images into the categories defined by the data sets’ authors.
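
The evaluation idea summarized in the abstract (cluster the per-image feature vectors, then check how well the clusters match the categories defined for the data set) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name the clustering algorithms or the scoring measure, so plain k-means and cluster purity are stand-ins here, and the feature values and labels are made up.

```python
from collections import Counter


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; the first k points serve as initial centroids."""
    centroids = [list(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        for i, p in enumerate(points):
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            assign[i] = dists.index(min(dists))
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def purity(assign, labels):
    """Fraction of images that share the majority label of their cluster."""
    by_cluster = {}
    for a, y in zip(assign, labels):
        by_cluster.setdefault(a, []).append(y)
    hits = sum(Counter(ys).most_common(1)[0][1] for ys in by_cluster.values())
    return hits / len(labels)


# Hypothetical stand-ins for per-image feature vectors and category labels.
features = [(0.9, 0.1), (0.1, 0.9), (0.8, 0.2),
            (0.2, 0.8), (0.85, 0.15), (0.15, 0.85)]
labels = ["beach", "forest", "beach", "forest", "beach", "forest"]

clusters = kmeans(features, k=2)   # labels are NOT used for clustering
score = purity(clusters, labels)   # labels are used only for scoring
```

The category labels enter only at scoring time, which is what keeps the grouping step unsupervised. Running the same procedure once per feature type would give one score per feature to compare.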
