Multi-feature Late Fusion for Image Tagging

Image tagging plays a critical role in image indexing and retrieval, and it has gained increasing attention as large quantities of web images have become available. However, most current tagging methods use only a single feature type, although combining multiple feature types has been shown to be effective for image analysis. In this paper, we propose a multi-feature late fusion method for image tagging. For an image, we first learn several scores for each tag with a tag relevance learner, using different single features or combinations of single features. We then learn an optimal combination weight for each tag score and linearly combine all the tag scores with the learned weights. Finally, a low-rank tag pairwise matrix is learned from the linearly combined tag scores, and a robust tag score is recovered from the low-rank matrix. The tags with the largest scores are taken as the predicted tags. We compare our approach with several multi-feature fusion techniques on the real-world NUS-WIDE dataset and show the effectiveness of the proposed multi-feature fusion method.
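The three stages described in the abstract (per-feature tag scores, weighted linear combination, low-rank pairwise matrix with score recovery) can be illustrated with a minimal sketch. The abstract does not give the formulation, so everything concrete here is an assumption: one weight per feature-level score vector, weights supplied by hand rather than learned, a pairwise matrix with entries s_i - s_j, a truncated SVD standing in for the low-rank learning step, and row averaging as the score recovery. The function names and the toy tags are hypothetical.

```python
import numpy as np

def fuse_tag_scores(scores, weights):
    """Linearly combine per-feature tag scores.

    scores:  (K, m) array, one row of m tag scores per feature (assumed layout).
    weights: (K,) array of combination weights (assumed given, not learned here).
    """
    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return weights @ scores

def pairwise_matrix(s):
    """Build the tag pairwise matrix T with T[i, j] = s[i] - s[j] (assumed form)."""
    s = np.asarray(s, dtype=float)
    return s[:, None] - s[None, :]

def low_rank_approx(T, rank=2):
    """Project T onto its best rank-`rank` approximation via truncated SVD.

    This is a stand-in for the low-rank learning step; a matrix of the form
    s 1^T - 1 s^T has rank at most 2, hence the default.
    """
    U, sigma, Vt = np.linalg.svd(T, full_matrices=False)
    return (U[:, :rank] * sigma[:rank]) @ Vt[:rank]

def recover_scores(T):
    """Recover a score vector as the row mean of T, i.e. s minus its mean."""
    return T.mean(axis=1)

def predict_tags(scores, weights, tag_names, top_k=2):
    """Run the sketch end to end and return the top_k tags and recovered scores."""
    s = fuse_tag_scores(scores, weights)
    T_hat = low_rank_approx(pairwise_matrix(s), rank=2)
    robust = recover_scores(T_hat)
    order = np.argsort(-robust)[:top_k]  # indices of the largest scores first
    return [tag_names[i] for i in order], robust

# Toy example: two features, three hypothetical tags.
tags, robust = predict_tags(
    scores=[[0.9, 0.1, 0.5], [0.7, 0.3, 0.2]],
    weights=[0.5, 0.5],
    tag_names=["sky", "dog", "beach"],
)
print(tags, robust)
```

In this noise-free toy case the pairwise matrix is already rank 2, so the projection leaves it unchanged and the recovered scores are simply the combined scores shifted by their mean. The low-rank step would only matter when the pairwise matrix is perturbed, for example by inconsistent comparisons across features.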
