Annotating photo collections by label propagation according to multiple similarity cues

This paper considers the emerging problem of annotating personal photo collections that are taken with digital cameras and may subsequently have been organized by their owners. Unlike images returned by web search engines or drawn from commercial image banks (e.g., the Corel database), the photos in a personal collection are related to each other in time, location, and content. Cameras can now record GPS coordinates for each photo, which provides a richer source of context for modeling and enforcing the correlation among photos in the same collection. Recognizing the well-known limitations (the "semantic gap") of visual recognition algorithms, we exploit this correlation to improve annotation performance. In our approach, high-confidence annotation labels are first obtained for certain photos and then propagated to the remaining photos in the same collection according to proximity in time and location and to visual similarity. A novel generative probabilistic model is employed, which outperforms the previous linear propagation scheme. Experimental results show the advantages of the proposed annotation scheme.
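To make the propagation idea concrete, the following is a minimal sketch of a simple linear, graph-based propagation over a combined time, location, and visual affinity. It is closer in spirit to the linear propagation baseline mentioned above than to the generative probabilistic model, whose details are not given here. Everything in it is an assumption made for illustration: the photo fields (`t`, `lat`, `lon`, `feat`), the Gaussian kernels and their bandwidths, the product rule for combining cues, and the damping factor `alpha`.

```python
import math

def gaussian(d, sigma):
    """Gaussian kernel turning a distance into a similarity in (0, 1]."""
    return math.exp(-(d * d) / (2.0 * sigma * sigma))

def affinity(p, q, sigma_t=1.0, sigma_g=0.01, sigma_v=0.5):
    """Combine time, location, and visual cues (hypothetical product rule)."""
    dt = abs(p["t"] - q["t"])                                # hours apart
    dg = math.hypot(p["lat"] - q["lat"], p["lon"] - q["lon"])  # degrees apart
    dv = math.sqrt(sum((a - b) ** 2 for a, b in zip(p["feat"], q["feat"])))
    return gaussian(dt, sigma_t) * gaussian(dg, sigma_g) * gaussian(dv, sigma_v)

def propagate(photos, seeds, labels, alpha=0.8, iters=50):
    """Spread seed labels over the affinity graph; seeds stay clamped."""
    n, m = len(photos), len(labels)
    # Pairwise affinities with no self-loops, then row-normalize.
    W = [[0.0 if i == j else affinity(photos[i], photos[j]) for j in range(n)]
         for i in range(n)]
    P = []
    for row in W:
        s = sum(row)
        P.append([w / s if s > 0 else 0.0 for w in row])
    # One-hot scores for the high-confidence (seed) photos, zeros elsewhere.
    Y = [[1.0 if seeds.get(i) == c else 0.0 for c in labels] for i in range(n)]
    F = [row[:] for row in Y]
    for _ in range(iters):
        F_next = []
        for i in range(n):
            if i in seeds:
                F_next.append(Y[i][:])  # keep seed labels fixed
            else:
                F_next.append([
                    alpha * sum(P[i][j] * F[j][k] for j in range(n))
                    + (1.0 - alpha) * Y[i][k]
                    for k in range(m)
                ])
        F = F_next
    # Assign each photo the label with the highest propagated score.
    return [labels[max(range(m), key=lambda k: F[i][k])] for i in range(n)]

# Toy collection: three photos from one outing, two from another a day later.
photos = [
    {"t": 0.0,  "lat": 36.000, "lon": -121.000, "feat": [0.9, 0.1]},
    {"t": 0.2,  "lat": 36.001, "lon": -121.000, "feat": [0.8, 0.2]},
    {"t": 0.4,  "lat": 36.001, "lon": -121.001, "feat": [0.85, 0.15]},
    {"t": 30.0, "lat": 37.770, "lon": -122.420, "feat": [0.1, 0.9]},
    {"t": 30.3, "lat": 37.771, "lon": -122.420, "feat": [0.2, 0.8]},
]
seeds = {0: "beach", 3: "city"}  # indices of photos with confident labels
predicted = propagate(photos, seeds, ["beach", "city"])
print(predicted)
```

In this toy setup, photos that are close to a seed in time, location, and appearance inherit its label, while the large time and location gap between the two outings keeps their labels from mixing. Multiplying the kernels requires agreement on all three cues at once; a weighted sum would be a more permissive alternative.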
