Multimodal Spatio-Temporal Theme Modeling for Landmark Analysis

Here, we discuss mining and summarizing the general, local, and temporal themes of landmarks. General themes occur across many landmarks and include accommodations and other standard features. A local theme exists only at a certain landmark, such as a unique physical characteristic. A temporal theme is a location- and time-representative pattern that relates only to a certain landmark during a certain period, such as Fleet Week at the Golden Gate Bridge or red maple leaves at Kiyomizu-dera. Local themes are useful in landmark analysis for their discriminative and representative attributes. Temporal themes are equally important because they reveal how a landmark varies at different moments. This time-dependent diversity shows complete viewing angles over time and complements local themes in landmark understanding. Furthermore, it provides more comprehensive and structured information for landmark history browsing and tourist decision making. We propose a probabilistic topic model called Multimodal Spatio-Temporal Theme Modeling (mmSTTM). The model considers both textual and visual contexts to learn general, local, and temporal themes, which span a low-dimensional theme space. The model also assigns every textual and visual keyword to each theme with a probability; a keyword with a high weight is meaningful for the theme, while low-weighted keywords are considered noise.
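The keyword-weighting idea can be illustrated with a minimal sketch. It assumes the theme distributions have already been learned; the dictionary `themes`, its probabilities, and the helper `top_keywords` are hypothetical illustrations, not part of mmSTTM or its reported results. Each theme holds a probability for every keyword, and the highest-weighted keywords are read as the theme's summary.

```python
from typing import Dict, List


def top_keywords(theme_word_probs: Dict[str, Dict[str, float]],
                 k: int = 2) -> Dict[str, List[str]]:
    """Return the k highest-probability keywords for each theme."""
    return {
        theme: [word for word, _ in
                sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]]
        for theme, probs in theme_word_probs.items()
    }


# Hypothetical toy distributions over a shared vocabulary (assumed values,
# chosen only for illustration). Each theme's probabilities sum to 1.
themes = {
    "general": {"hotel": 0.45, "ticket": 0.35, "bridge": 0.10,
                "cable": 0.04, "fleet": 0.03, "ships": 0.03},
    "local": {"hotel": 0.05, "ticket": 0.05, "bridge": 0.50,
              "cable": 0.30, "fleet": 0.05, "ships": 0.05},
    "temporal": {"hotel": 0.04, "ticket": 0.06, "bridge": 0.10,
                 "cable": 0.05, "fleet": 0.40, "ships": 0.35},
}

summary = top_keywords(themes, k=2)
for theme, words in summary.items():
    print(theme, words)
```

In this toy setting, the low-weighted keywords left out of each theme's top list play the role of noise for that theme.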
