Efficient Large-Scale Image Data Set Exploration: Visual Concept Network and Image Summarization

Given the scale of online image collections, a framework for efficient data exploration is essential. In this paper, we build exploration models around two considerations: inter-concept visual correlation and intra-concept image summarization. For inter-concept visual correlation, we have developed an automatic algorithm that generates a visual concept network, which is characterized by the visual correlation between pairs of image concepts. To incorporate reliable inter-concept correlation contexts, multiple kernels are combined and a kernel canonical correlation analysis algorithm is used to characterize the diverse visual similarity contexts between the image concepts. For intra-concept image summarization, we propose a greedy algorithm that sequentially picks the best representative images for an image concept set. The quality score of each candidate summary is computed from the clustering result and considers relevancy, orthogonality, and uniformity terms simultaneously. Visualization techniques are developed to assist users in assessing the coherence between concept pairs and in investigating the visual properties within a concept. We have conducted experiments and user studies to evaluate both algorithms. The experiments yielded good results, and the user studies drew positive feedback.
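The greedy summarization step can be illustrated with a minimal sketch. The abstract does not give the scoring formulas, so everything below is an assumption made for illustration: the function names `cosine` and `greedy_summary`, the equal weights, and the particular definitions of the relevancy term (similarity to the image's cluster centroid), the orthogonality term (dissimilarity to images already picked), and the uniformity term (a bonus for covering a cluster not yet represented). It is not the scoring function used in the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def greedy_summary(features, labels, k, w_rel=1.0, w_orth=1.0, w_uni=1.0):
    """Greedily pick up to k image indices as a summary.

    features: list of feature vectors, one per image.
    labels:   cluster label of each image (from any prior clustering step).
    The three score terms and their weights are illustrative assumptions.
    """
    # Centroid of each cluster, used by the assumed relevancy term.
    centroids = {}
    for c in set(labels):
        members = [f for f, l in zip(features, labels) if l == c]
        centroids[c] = [sum(col) / len(members) for col in zip(*members)]

    chosen = []
    while len(chosen) < min(k, len(features)):
        covered = {labels[i] for i in chosen}
        best, best_score = None, -math.inf
        for i, f in enumerate(features):
            if i in chosen:
                continue
            # Relevancy: how typical the image is of its own cluster.
            relevancy = cosine(f, centroids[labels[i]])
            # Orthogonality: how different it is from the closest picked image.
            orthogonality = 1.0 - max(
                (cosine(f, features[j]) for j in chosen), default=0.0
            )
            # Uniformity: reward clusters that have no representative yet.
            uniformity = 0.0 if labels[i] in covered else 1.0
            score = w_rel * relevancy + w_orth * orthogonality + w_uni * uniformity
            if score > best_score:
                best, best_score = i, score
        chosen.append(best)
    return chosen


# Toy data: two well-separated clusters of 2-D features.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = [0, 0, 1, 1]
picked = greedy_summary(features, labels, k=2)
print(picked)
```

With this toy input, the second pick is pushed away from the first by both the orthogonality and uniformity terms, so the summary covers both clusters. Raising `w_rel` relative to the other weights would instead favor the most typical images even when they are near-duplicates.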
