Web Video Geolocation by Geotagged Social Resources

This paper considers the problem of web video geolocation: determining where on Earth a web video was taken. By analyzing a dataset of 6.5 million geotagged web videos, we observe an inherent geographic closeness between a video and its relevant videos (related videos and same-author videos). This social relationship supplies a direct and effective cue for locating the video in a particular region on Earth. Based on this observation, we propose an effective web video geolocation algorithm that propagates geotags across the social relationship graph of web videos. For videos that have no geotagged relevant videos, we collect geotagged relevant images that are similar in content to the video (that is, they share some visual or textual information with it) and use them as the cue to infer the video's location. Experiments demonstrate the effectiveness of both methods, with geolocation accuracy much better than that of state-of-the-art approaches. Finally, an online web video geolocation system, Video2Location (V2L), is developed to provide public access to our algorithm.
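As a rough illustration of the geotag propagation idea, the sketch below estimates a location for an untagged video from the geotags of its relevant videos. It is not the paper's algorithm: the one-hop lookup, the densest-neighbor voting rule, the 100 km radius, and the names `haversine_km`, `propagate_geotag`, `neighbors`, and `geotags` are all assumptions made for this example.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def propagate_geotag(video, neighbors, geotags, radius_km=100.0):
    """Estimate a location for `video` from its geotagged relevant videos.

    neighbors: dict mapping a video id to the ids of its relevant videos
               (related or same-author); hypothetical input format.
    geotags:   dict mapping a video id to a known (lat, lon) geotag.
    Returns the neighbor geotag supported by the most other neighbor
    geotags within `radius_km`, or None if no neighbor is geotagged.
    """
    candidates = [geotags[n] for n in neighbors.get(video, ()) if n in geotags]
    if not candidates:
        return None  # no social cue; a content-based fallback would be needed

    def support(p):
        # Number of neighbor geotags (including p itself) close to p.
        return sum(1 for q in candidates if haversine_km(p, q) <= radius_km)

    return max(candidates, key=support)


# Illustrative data only: two neighbors near Paris, one in New York.
neighbors = {"v": ["a", "b", "c"], "w": []}
geotags = {"a": (48.86, 2.35), "b": (48.85, 2.29), "c": (40.71, -74.00)}
estimate = propagate_geotag("v", neighbors, geotags)
```

Here the two Paris geotags support each other, so the estimate for `"v"` falls in Paris, while `"w"` has no geotagged relevant videos and yields `None`, which corresponds to the case the abstract handles with geotagged relevant images.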
