Unified tag analysis with multi-edge graph

Image tags have become a key intermediate vehicle for organizing, indexing, and searching massive online image repositories. Extensive research has been conducted on different yet related tag analysis tasks, e.g., tag refinement, tag-to-region assignment, and automatic tagging. In this paper, we propose the new concept of a multi-edge graph, through which a unified solution is derived for these tag analysis tasks. Specifically, each vertex of the graph corresponds to a unique image. Each image is encoded as a bag of regions obtained from multiple image segmentations, and thresholding the pairwise similarities between regions naturally constructs the multiple edges between each vertex pair. Unified tag analysis is then described in general terms as tag propagation between a vertex and its edges, as well as among all edges across the entire image repository. We develop a core vertex-vs-edge tag equation, unique to the multi-edge graph, that unifies the image/vertex tag(s) and the region-pair/edge tag(s). Finally, unified tag analysis is formulated as a constrained optimization problem in which the objective function, characterizing the cross-patch tag consistency, is constrained by the core equations for all vertex pairs; the cutting plane method is used for efficient optimization. Extensive experiments on various tag analysis tasks over three widely used benchmark datasets validate the effectiveness of the proposed unified solution.
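The graph construction described above can be illustrated with a minimal sketch. It is not the authors' implementation: the cosine similarity measure, the toy two-dimensional region features, the threshold value, and the names `cosine` and `build_multi_edge_graph` are all assumptions introduced here for illustration. Each image is a vertex holding a bag of region feature vectors, and every cross-image region pair whose similarity reaches the threshold contributes one edge between the two vertices, so a vertex pair may be joined by several edges.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two feature vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_multi_edge_graph(region_bags, threshold):
    """Build a multi-edge graph from per-image region bags.

    region_bags: dict mapping image id -> list of region feature vectors.
    Returns a dict mapping (image_a, image_b) -> list of edges, where each
    edge is (region_index_a, region_index_b, similarity).
    """
    edges = {}
    for img_a, img_b in combinations(sorted(region_bags), 2):
        pair_edges = []
        for i, region_a in enumerate(region_bags[img_a]):
            for j, region_b in enumerate(region_bags[img_b]):
                sim = cosine(region_a, region_b)
                # Thresholding: keep only sufficiently similar region pairs.
                if sim >= threshold:
                    pair_edges.append((i, j, sim))
        if pair_edges:
            edges[(img_a, img_b)] = pair_edges
    return edges


# Toy data (hypothetical): three images with hand-made region features.
bags = {
    "img1": [[1.0, 0.0], [0.0, 1.0]],
    "img2": [[1.0, 0.1], [0.1, 1.0]],
    "img3": [[-1.0, 0.0]],
}
graph = build_multi_edge_graph(bags, threshold=0.9)
for pair, pair_edges in graph.items():
    print(pair, [(i, j) for i, j, _ in pair_edges])
```

In this toy example, `img1` and `img2` end up connected by two edges, one per matching region pair, while `img3` shares no edge with either image. The tag propagation and the constrained optimization described in the abstract operate on top of such a structure and are not shown here.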
