Quantitative Characterization of Semantic Gaps for Learning Complexity Estimation and Inference Model Selection

In this paper, a novel data-driven algorithm is developed for achieving quantitative characterization of the semantic gaps directly in the visual feature space, where the visual feature space is the common space for concept classifier training and automatic concept detection. By supporting quantitative characterization of the semantic gaps, more effective inference models can automatically be selected for concept classifier training by: (1) identifying the image concepts with small semantic gaps (i.e., the isolated image concepts with high inner-concept visual consistency) and training their one-against-all SVM concept classifiers independently; (2) determining the image concepts with large semantic gaps (i.e., the visually-related image concepts with low inner-concept visual consistency) and training their inter-related SVM concept classifiers jointly; and (3) using more image instances to achieve more reliable training of the concept classifiers for the image concepts with large semantic gaps. Our experimental results on NUS-WIDE and ImageNet image sets have obtained very promising results.

[1]  Changhu Wang,et al.  Learning to reduce the semantic gap in web image retrieval and annotation , 2008, SIGIR '08.

[2]  Bob J. Wielinga,et al.  Ontology-Based Photo Annotation , 2001, IEEE Intell. Syst..

[3]  Alexei A. Efros,et al.  Unsupervised discovery of visual object class hierarchies , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[4]  Marcel Worring,et al.  Content-Based Image Retrieval at the End of the Early Years , 2000, IEEE Trans. Pattern Anal. Mach. Intell..

[5]  Jianping Fan,et al.  Integrating Concept Ontology and Multitask Learning to Achieve More Effective Classifier Training for Multilevel Image Annotation , 2008, IEEE Transactions on Image Processing.

[6]  Narendra Ahuja,et al.  Learning the Taxonomy and Models of Categories Present in Arbitrary Images , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[7]  Rong Yan,et al.  How many high-level concepts will fill the semantic gap in news video retrieval? , 2007, CIVR '07.

[8]  Pietro Perona,et al.  Unsupervised learning of visual taxonomies , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[9]  Rong Yan,et al.  Can High-Level Concepts Fill the Semantic Gap in Video Retrieval? A Case Study With Broadcast News , 2007, IEEE Transactions on Multimedia.

[10]  Tat-Seng Chua,et al.  NUS-WIDE: a real-world web image database from National University of Singapore , 2009, CIVR '09.

[11]  Peter G. B. Enser,et al.  Towards a Comprehensive Survey of the Semantic Gap in Visual Image Retrieval , 2003, CIVR.

[12]  Antonio Torralba,et al.  Sharing Visual Features for Multiclass and Multiview Object Detection , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[13]  William I. Grosky,et al.  Narrowing the semantic gap - improved text-based web document retrieval using visual features , 2002, IEEE Trans. Multim..

[14]  Qi Tian,et al.  Constructing Concept Lexica With Small Semantic Gaps , 2010, IEEE Transactions on Multimedia.

[15]  Charles A. Micchelli,et al.  Learning Multiple Tasks with Kernel Methods , 2005, J. Mach. Learn. Res..

[16]  John R. Smith,et al.  Large-scale concept ontology for multimedia , 2006, IEEE MultiMedia.

[17]  Nuno Vasconcelos,et al.  Bridging the Gap: Query by Semantic Example , 2007, IEEE Transactions on Multimedia.

[18]  Fei-Fei Li,et al.  Building and using a semantivisual image hierarchy , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[19]  Chitra Dorai,et al.  Bridging the semantic gap with computational media aesthetics , 2003, IEEE MultiMedia.

[20]  Christoph H. Lampert,et al.  Learning to Localize Objects with Structured Output Regression , 2008, ECCV.

[21]  Thomas Deselaers,et al.  Visual and semantic similarity in ImageNet , 2011, CVPR 2011.

[22]  Jonathon S. Hare,et al.  Bridging the Semantic Gap in Multimedia Information Retrieval: Top-down and Bottom-up approaches , 2006 .

[23]  Cordelia Schmid,et al.  Semantic Hierarchies for Visual Object Recognition , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[24]  Fei-Fei Li,et al.  ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[25]  John R. Smith,et al.  Semantic representation: search and mining of multimedia content , 2004, KDD '04.

[26]  Simone Santini,et al.  Emergent Semantics through Interaction in Image Databases , 2001, IEEE Trans. Knowl. Data Eng..

[27]  Christiane Fellbaum,et al.  Book Reviews: WordNet: An Electronic Lexical Database , 1999, CL.

[28]  Marcel Worring,et al.  The Semantic Pathfinder: Using an Authoring Metaphor for Generic Multimedia Indexing , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[29]  Nenghai Yu,et al.  Flickr distance , 2008, ACM Multimedia.

[30]  Jianping Fan,et al.  Mining Multilevel Image Semantics via Hierarchical Classification , 2008, IEEE Transactions on Multimedia.

[31]  Cordelia Schmid,et al.  Constructing Category Hierarchies for Visual Recognition , 2008, ECCV.

[32]  Qi Tian,et al.  What are the high-level concepts with small semantic gaps? , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[33]  Michael R. Lyu,et al.  Bridging the Semantic Gap Between Image Contents and Tags , 2010, IEEE Transactions on Multimedia.

[34]  Pietro Perona,et al.  Learning and using taxonomies for fast visual categorization , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.