GeoFolk: latent spatial semantics in Web 2.0 social media

We describe an approach for multi-modal characterization of social media that combines text features (e.g., tags, a prominent example of short, unstructured text labels) with spatial knowledge (e.g., geotags and coordinates of images and videos). Our model-based framework, GeoFolk, combines these two aspects to construct better algorithms for content management, retrieval, and sharing. The approach is based on multi-modal Bayesian models, which allow us to integrate the spatial semantics of social media in a well-formed, probabilistic manner. We systematically evaluate the solution on a subset of Flickr data in three characteristic scenarios: tag recommendation, content classification, and clustering. Experimental results show that our method outperforms baseline techniques that rely on only one of the two aspects. The approach described in this contribution can also be applied in other domains, such as Geoweb retrieval.
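The abstract does not spell out the structure of the multi-modal Bayesian model. As a rough illustration only, the sketch below assumes an LDA-style generative process in which each latent topic carries a distribution over tags plus independent Gaussians over latitude and longitude. The topic table, its parameter values, and the function name `sample_resource` are hypothetical and are not taken from the paper.

```python
import random

# Hypothetical toy topics (assumption, not from the paper): each topic has
# a tag distribution and Gaussian (mean, std) parameters for latitude and
# longitude. All values are illustrative.
TOPICS = [
    {"tags": {"eiffel": 0.5, "paris": 0.4, "tower": 0.1},
     "lat": (48.858, 0.02), "lon": (2.294, 0.02)},
    {"tags": {"bigben": 0.5, "london": 0.4, "thames": 0.1},
     "lat": (51.501, 0.02), "lon": (-0.125, 0.02)},
]


def sample_resource(topic_weights, n_tags, rng):
    """Generate n_tags (tag, lat, lon) triples for one tagged resource."""
    triples = []
    for _ in range(n_tags):
        # Pick a latent topic from the resource's topic mixture.
        z = rng.choices(range(len(TOPICS)), weights=topic_weights)[0]
        topic = TOPICS[z]
        # Draw the tag from the topic's tag distribution.
        tags = list(topic["tags"])
        probs = list(topic["tags"].values())
        tag = rng.choices(tags, weights=probs)[0]
        # Draw coordinates from the topic's latitude/longitude Gaussians.
        lat = rng.gauss(*topic["lat"])
        lon = rng.gauss(*topic["lon"])
        triples.append((tag, lat, lon))
    return triples


if __name__ == "__main__":
    rng = random.Random(0)
    # A resource that mixes the two toy topics 80/20.
    for triple in sample_resource([0.8, 0.2], 5, rng):
        print(triple)
```

Under this assumption, fitting the model means inverting the process: given observed tags and coordinates, infer the topic assignments and per-topic parameters, so that tags that co-occur in the same region end up in the same topic.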
