Global annotation on georeferenced photographs

We present an efficient world-scale system that automatically annotates collections of georeferenced photos. When a user uploads a photograph, the system estimates its place of origin from visual features, and the user can then refine that estimate. Once the correct location is provided, the system suggests tags based on geographic and visual similarity to images retrieved from a large database of 1.2 million images crawled from Flickr. The system mines geographically relevant terms and ranks candidate tags by their posterior probability given the observed visual and geocoordinate features. A series of experiments analyzes the accuracy of geocoordinate prediction and the precision and recall of tag suggestions based on information retrieval techniques. The system is novel in that it fuses geographic and visual information to annotate uploaded photographs taken anywhere in the world in a matter of seconds.
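The abstract does not spell out the probabilistic model, so the following is only a minimal sketch of how candidate tags might be ranked by a posterior-like score given a photo's coordinates and visual features. Everything in it is an assumption made for illustration: the Gaussian kernels, the bandwidths `geo_sigma_km` and `vis_sigma`, the dictionary layout of a photo, and the function names `haversine_km` and `rank_tags`. It should not be read as the system's actual method.

```python
import math
from collections import defaultdict


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def rank_tags(query, database, geo_sigma_km=1.0, vis_sigma=0.5):
    """Rank tags by a kernel-weighted estimate of P(tag | location, features).

    Each photo is a dict with hypothetical keys 'lat', 'lon', 'feature'
    (a tuple of floats) and 'tags' (a list of strings). Both kernels are
    assumed Gaussians; the source does not state which model is used.
    """
    scores = defaultdict(float)
    total_weight = 0.0
    for photo in database:
        d_geo = haversine_km(query["lat"], query["lon"],
                             photo["lat"], photo["lon"])
        d_vis = math.dist(query["feature"], photo["feature"])
        # Fuse geographic and visual evidence by multiplying the two kernels.
        weight = (math.exp(-0.5 * (d_geo / geo_sigma_km) ** 2)
                  * math.exp(-0.5 * (d_vis / vis_sigma) ** 2))
        total_weight += weight
        for tag in set(photo["tags"]):
            scores[tag] += weight
    if total_weight == 0.0:
        return []
    # Normalise so each score lies in [0, 1]; sort by score, then by tag name.
    ranked = [(tag, score / total_weight) for tag, score in scores.items()]
    return sorted(ranked, key=lambda item: (-item[1], item[0]))
```

Calling `rank_tags(query, database)` returns `(tag, score)` pairs in descending order of score. Photos that are far away or visually dissimilar contribute almost nothing, which is one simple way to express the fusion of geographic and visual information described above.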
