Annotating personal albums via web mining

Nowadays personal albums are becoming more and more popular due to the explosive growth of digital image capturing devices. An effective automatic annotation system for personal albums is desired for both efficient browsing and search. Existing research on image annotation evolves through two stages: learning-based methods and web-based methods. Learning-based methods attempt to learn classifiers or joint probabilities between images and concepts, which are difficult to handle large-scale concept sets due to the lack of training data. Web-based methods leverage web image data to learn relevant annotations, which greatly expand the scale of concepts. However, they still suffer two problems: the query image lacks prior knowledge and the annotations are often noisy and incoherent. To address the above issues, we propose a web-based annotation approach to annotate a collection of photos simultaneously, instead of annotating them independently, by leveraging the abundant correlations among the photos. A multi-graph similarity propagation based semi-supervised learning (MGSP-SSL) algorithm is proposed to suppress the noises in the initial annotations from the Web. Experiments on real personal albums show that the proposed approach outperforms existing annotation methods.

[1]  R. Manmatha,et al.  Multiple Bernoulli relevance models for image and video annotation , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[2]  David A. Forsyth,et al.  Object Recognition as Machine Translation: Learning a Lexicon for a Fixed Image Vocabulary , 2002, ECCV.

[3]  Michael I. Jordan,et al.  Modeling annotated data , 2003, SIGIR.

[4]  Paul M. B. Vitányi,et al.  The Google Similarity Distance , 2004, IEEE Transactions on Knowledge and Data Engineering.

[5]  Meng Wang,et al.  Optimizing multi-graph learning: towards a unified video annotation scheme , 2007, ACM Multimedia.

[6]  Andreas Girgensohn,et al.  Temporal event clustering for digital photo collections , 2005, ACM Trans. Multim. Comput. Commun. Appl..

[7]  Wei-Ying Ma,et al.  Learning an image manifold for retrieval , 2004, MULTIMEDIA '04.

[8]  Mingjing Li,et al.  Automated annotation of human faces in family albums , 2003, MULTIMEDIA '03.

[9]  Wei-Ying Ma,et al.  A Probabilistic Semantic Model for Image Annotation and Multi-Modal Image Retrieva , 2005, ICCV.

[10]  Bernhard Schölkopf,et al.  Learning with Local and Global Consistency , 2003, NIPS.

[11]  Edward Y. Chang,et al.  CBSA: content-based soft annotation for multimodal image retrieval using Bayes point machines , 2003, IEEE Trans. Circuits Syst. Video Technol..

[12]  R. Manmatha,et al.  Automatic image annotation and retrieval using cross-media relevance models , 2003, SIGIR.

[13]  Wei-Ying Ma,et al.  A probabilistic semantic model for image annotation and multi-modal image retrieval , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[14]  Forouzan Golshani EIC's Message: Multimedia is Correlated Media , 2004, IEEE Multim..

[15]  R. Manmatha,et al.  A Model for Learning the Semantics of Pictures , 2003, NIPS.

[16]  Nello Cristianini,et al.  Learning Semantic Similarity , 2002, NIPS.

[17]  Tao Mei,et al.  Temporally Consistent Gaussian Random Field for Video Semantic Analysis , 2007, 2007 IEEE International Conference on Image Processing.

[18]  Yun Chi,et al.  Detecting splogs via temporal dynamics using self-similarity analysis , 2008, TWEB.

[19]  Andreas Girgensohn,et al.  Temporal event clustering for digital photo collections , 2003, ACM Multimedia.

[20]  Wei-Ying Ma,et al.  Multi-model similarity propagation and its application for web image retrieval , 2004, MULTIMEDIA '04.

[21]  Bin Wang,et al.  Dual cross-media relevance model for image annotation , 2007, ACM Multimedia.

[22]  Yuandong Tian,et al.  EasyAlbum: an interactive photo annotation system based on face clustering and re-ranking , 2007, CHI.

[23]  Andrew Zisserman,et al.  Video Google: a text retrieval approach to object matching in videos , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[24]  Changhu Wang,et al.  Scalable search-based image annotation of personal images , 2006, MIR '06.

[25]  Wei-Ying Ma,et al.  AnnoSearch: Image Auto-Annotation by Search , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[26]  James Ze Wang,et al.  Automatic Linguistic Indexing of Pictures by a Statistical Modeling Approach , 2003, IEEE Trans. Pattern Anal. Mach. Intell..

[27]  Michael Isard,et al.  Object retrieval with large vocabularies and fast spatial matching , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[28]  Wei-Ying Ma,et al.  Bipartite graph reinforcement model for web image annotation , 2007, ACM Multimedia.

[29]  David G. Lowe,et al.  Local feature view clustering for 3D object recognition , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.

[30]  Wei-Ying Ma,et al.  An adaptive graph model for automatic image annotation , 2006, MIR '06.

[31]  Susanne Boll,et al.  Semantics, content, and structure of many for the creation of personal photo albums , 2007, ACM Multimedia.

[32]  Lynda Hardman,et al.  Creating meaningful multimedia presentations , 2006, 2006 IEEE International Symposium on Circuits and Systems.

[33]  Ge Yu,et al.  Similarity Match Over High Speed Time-Series Streams , 2007, 2007 IEEE 23rd International Conference on Data Engineering.

[34]  Raimondo Schettini,et al.  Image annotation using SVM , 2003, IS&T/SPIE Electronic Imaging.

[35]  Wolfgang Klas,et al.  Context-Aware Multimedia , 2006, Encyclopedia of Multimedia.

[36]  Laurent Amsaleg,et al.  Scalability of local image descriptors: a comparative study , 2006, MM '06.

[37]  Lexing Xie,et al.  Contextual wisdom: social relations and correlations for multimedia event annotation , 2007, ACM Multimedia.