A Unified Geolocation Framework for Web Videos

In this article, we propose a unified geolocation framework that automatically determines where on Earth a web video was shot. We analyze social, visual, and textual relationships in a real-world dataset and identify four relationships that carry clear geographic clues usable for web video geolocation. We then formulate geolocation as an optimization problem that takes the social, visual, and textual relationships into account simultaneously. The problem is solved by an iterative procedure, which can be interpreted as propagating geographic information across the web video social network. Extensive experiments on a real-world dataset demonstrate the effectiveness of the proposed framework, which achieves higher geolocation accuracy than state-of-the-art approaches.
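The abstract does not give the objective function or the update rule, so the following is only a minimal sketch of the general idea of propagating location labels over a graph of videos. It assumes a generic label-propagation update of the local-and-global-consistency kind, F <- alpha * S * F + (1 - alpha) * Y, which is not necessarily the authors' procedure. The function names, the relation weights, the value of alpha, and the toy affinity matrices are all hypothetical.

```python
from math import sqrt

def combine(graphs, weights):
    """Weighted sum of several affinity matrices (e.g. social, visual, textual)."""
    n = len(graphs[0])
    return [[sum(w * g[i][j] for g, w in zip(graphs, weights))
             for j in range(n)] for i in range(n)]

def normalize(W):
    """Symmetric normalization S = D^(-1/2) W D^(-1/2); isolated nodes get zeros."""
    n = len(W)
    deg = [sum(row) for row in W]
    return [[W[i][j] / sqrt(deg[i] * deg[j]) if deg[i] > 0 and deg[j] > 0 else 0.0
             for j in range(n)] for i in range(n)]

def propagate(W, Y, alpha=0.8, iters=50):
    """Iterate F <- alpha * S * F + (1 - alpha) * Y for a fixed number of steps."""
    S = normalize(W)
    n, k = len(Y), len(Y[0])
    F = [row[:] for row in Y]
    for _ in range(iters):
        F = [[alpha * sum(S[i][j] * F[j][c] for j in range(n))
              + (1 - alpha) * Y[i][c]
              for c in range(k)] for i in range(n)]
    return F

def predict(F):
    """Assign each video to the location cell with the highest score."""
    return [max(range(len(row)), key=row.__getitem__) for row in F]

# Toy data (hypothetical): 4 videos, 2 candidate location cells.
# Videos 0 and 3 are geotagged; videos 1 and 2 are to be located.
social  = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
visual  = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
textual = [[0, 0, 0, 0], [0, 0, 0.2, 0], [0, 0.2, 0, 0], [0, 0, 0, 0]]

W = combine([social, visual, textual], weights=[0.4, 0.3, 0.3])
Y = [[1, 0], [0, 0], [0, 0], [0, 1]]  # one-hot rows for geotagged videos

cells = predict(propagate(W, Y))
print(cells)  # -> [0, 0, 1, 1]
```

In this toy graph, video 1 inherits the cell of video 0 through the strong social link, and video 2 inherits the cell of video 3 through the visual link; the weak textual link between videos 1 and 2 is not enough to override either.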
