Leveraging large-scale weakly-tagged images to train inter-related classifiers for multi-label annotation

In this paper, we have developed a new multi-label multi-task learning framework to leverage large-scale weakly-tagged images for inter-related classifier training. A novel image and tag cleansing algorithm is developed for tackling the issues of spam, synonymous, loose and ambiguous tags and obtain more relevant images. The visual concept network is generated to characterize the inter-concept visual similarity contexts precisely and determine the inter-related learning tasks automatically. Through a multi-label multi-task learning paradigm, our structured max-margin learning algorithm can leverage both large-scale weakly-tagged images and the visual concept network to learn large amounts of inter-related classifiers for supporting multi-label image annotation.

[1]  Jianping Fan,et al.  An Interactive Approach for Filtering Out Junk Images From Keyword-Based Google Search Results , 2009, IEEE Transactions on Circuits and Systems for Video Technology.

[2]  Marcel Worring,et al.  Content-Based Image Retrieval at the End of the Early Years , 2000, IEEE Trans. Pattern Anal. Mach. Intell..

[3]  Tao Mei,et al.  Correlative multi-label video annotation , 2007, ACM Multimedia.

[4]  Pietro Perona,et al.  A Visual Category Filter for Google Images , 2004, ECCV.

[5]  Fei-Fei Li,et al.  OPTIMOL: Automatic Online Picture Collection via Incremental Model Learning , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[6]  Hava T. Siegelmann,et al.  Support Vector Clustering , 2002, J. Mach. Learn. Res..

[7]  Martial Hebert,et al.  Discriminative Random Fields , 2006, International Journal of Computer Vision.

[8]  Manik Varma,et al.  Learning The Discriminative Power-Invariance Trade-Off , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[9]  Wei-Ying Ma,et al.  Hierarchical clustering of WWW image search results using visual, textual and link information , 2004, MULTIMEDIA '04.

[10]  Andrew Zisserman,et al.  Representing shape with a spatial pyramid kernel , 2007, CIVR '07.

[11]  John Shawe-Taylor,et al.  Canonical Correlation Analysis: An Overview with Application to Learning Methods , 2004, Neural Computation.

[12]  Mark A. Girolami,et al.  Mercer kernel-based clustering in feature space , 2002, IEEE Trans. Neural Networks.

[13]  Antonio Criminisi,et al.  Harvesting Image Databases from the Web , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[14]  Antonio Torralba,et al.  Sharing features: efficient boosting procedures for multiclass object detection , 2004, CVPR 2004.

[15]  Eric P. Xing,et al.  Harmonium Models for Semantic Video Representation and Classification , 2007, SDM.

[16]  Antonio Torralba,et al.  Ieee Transactions on Pattern Analysis and Machine Intelligence 1 80 Million Tiny Images: a Large Dataset for Non-parametric Object and Scene Recognition , 2022 .

[17]  Wei-Ying Ma,et al.  Multi-model similarity propagation and its application for web image retrieval , 2004, MULTIMEDIA '04.

[18]  Antonio Torralba,et al.  LabelMe: A Database and Web-Based Tool for Image Annotation , 2008, International Journal of Computer Vision.

[19]  Shih-Fu Chang,et al.  Context-Based Concept Fusion with Boosted Conditional Random Fields , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.

[20]  Jianping Fan,et al.  Automatic image annotation by incorporating feature hierarchy and boosting to scale up SVM classifiers , 2006, MM '06.

[21]  Rong Jin,et al.  Web image retrieval re-ranking with relevance model , 2003, Proceedings IEEE/WIC International Conference on Web Intelligence (WI 2003).

[22]  Jitendra Malik,et al.  Learning Globally-Consistent Local Distance Functions for Shape-Based Image Retrieval and Classification , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[23]  David A. Forsyth,et al.  Animals on the Web , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[24]  Ben Taskar,et al.  Max-Margin Markov Networks , 2003, NIPS.

[25]  Jianping Fan,et al.  Integrating Concept Ontology and Multitask Learning to Achieve More Effective Classifier Training for Multilevel Image Annotation , 2008, IEEE Transactions on Image Processing.

[26]  Jianping Fan,et al.  JustClick: Personalized Image Recommendation via Exploratory Search From Large-Scale Flickr Images , 2009, IEEE Transactions on Circuits and Systems for Video Technology.

[27]  Cordelia Schmid,et al.  Local Features and Kernels for Classification of Texture and Object Categories: A Comprehensive Study , 2006, 2006 Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'06).

[28]  S. Lazebnik,et al.  Local Features and Kernels for Classification of Texture and Object Categories: An In-Depth Study , 2005 .