Mining city landmarks from blogs by graph modeling

Recent years have witnessed rapid growth in community-contributed multimedia. Discovering, extracting, and summarizing knowledge from these data helps us make better sense of the world. In this paper, we report our work on mining famous city landmarks from blogs for personalized tourist suggestions. Our main contribution is a graph modeling framework that discovers city landmarks by mining blog photo correlations with community supervision. The framework fuses context, content, and community information in a style that simulates both static (PageRank) and dynamic (HITS) ranking models to highlight representative data from the consensus of blog users. As a preliminary step, we identify the geographical locations of page contents to harvest city sight photos from Web blogs, and we then structure these photos into a Scene-View hierarchy within each city. Our graph modeling consists of two phases. First, within a given scene, we present a PhotoRank algorithm to discover its representative views; it adapts PageRank to model context and content photo correlations for graph-based popularity propagation. Second, among the scenes within each city, we present a Landmark-HITS model to discover city landmarks, which integrates author correlations to infer scene popularity in a semi-supervised reinforcement manner. Building on the graph modeling, we further provide personalized tourist suggestions through collaborative filtering of tourism logs and author correlations. Using a real-world dataset from Windows Live Spaces blogs containing nearly 400,000 sight photos, we have deployed our framework in a VisualTourism system and compared it with state-of-the-art methods. We also investigate how city popularity, user location (e.g., Asian or European blog users), and sequential events (e.g., the Olympic Games) influence our landmark discovery results and tourist suggestion tendencies.
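
The abstract does not give the PhotoRank update rule, so the following is only a minimal sketch of the general idea it names: PageRank-style popularity propagation over a photo graph within one scene. It assumes a hypothetical symmetric `similarity` matrix that already combines context and content correlations between photos; the function name `photo_rank`, the damping value, and the toy data are illustrative assumptions, not the authors' formulation.

```python
def photo_rank(similarity, damping=0.85, max_iter=100, tol=1e-10):
    """PageRank-style power iteration over a photo similarity graph.

    similarity[j][i] is an assumed non-negative correlation weight
    between photo j and photo i (hypothetical input, not from the paper).
    Returns one popularity score per photo; scores sum to 1.
    """
    n = len(similarity)
    if n == 0:
        return []
    row_sums = [sum(row) for row in similarity]
    scores = [1.0 / n] * n
    for _ in range(max_iter):
        # Every photo receives the uniform "teleport" share first.
        new_scores = [(1.0 - damping) / n] * n
        for j in range(n):
            if row_sums[j] == 0:
                # A photo with no correlations spreads its score uniformly.
                share = damping * scores[j] / n
                for i in range(n):
                    new_scores[i] += share
            else:
                # Otherwise it passes score to neighbours in proportion
                # to the normalized similarity weights.
                for i in range(n):
                    weight = similarity[j][i] / row_sums[j]
                    new_scores[i] += damping * scores[j] * weight
        delta = sum(abs(a - b) for a, b in zip(new_scores, scores))
        scores = new_scores
        if delta < tol:
            break
    return scores


# Toy scene: photos 0-2 are mutually similar views, photo 3 is an outlier.
toy_similarity = [
    [0.0, 0.9, 0.8, 0.1],
    [0.9, 0.0, 0.7, 0.1],
    [0.8, 0.7, 0.0, 0.1],
    [0.1, 0.1, 0.1, 0.0],
]
toy_scores = photo_rank(toy_similarity)
representative = max(range(len(toy_scores)), key=toy_scores.__getitem__)
print("scores:", [round(s, 3) for s in toy_scores])
print("most representative photo index:", representative)
```

In this toy scene the mutually similar photos reinforce one another and the weakly connected outlier receives the lowest score, which is the behaviour one would expect from a representative-view ranking.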
