Faceted Browsing over Social Media

The popularity of social media as a medium for sharing information has made extracting information of interest a challenge. In this work we provide a system that returns social media posts covering various aspects of the concept being searched. We present a faceted model for navigating social media that provides a consistent, usable and domain-agnostic method for extracting information from it. We present a set of domain-independent facets and empirically demonstrate the feasibility of mapping social media content to the facets we chose. Next, we show how social media sites, which are living documents that change periodically, can be mapped to topics that capture the semantics expressed in them. This mapping is used as a graph to compute the various facets of interest to us. From it, we learn a profile of the content creator, map content to semantic concepts for easy navigation, and detect similarity among sites, either to suggest similar pages or to identify pages that express different views.
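As a rough illustration of the site-to-topic mapping and the similarity detection described above, the following minimal sketch represents the mapping as a weighted bipartite graph and compares sites by cosine similarity over their topic weights. The site names, topic labels, weights, and the choice of cosine similarity are all assumptions made for illustration; the abstract does not specify how the graph is built or which similarity measure is used.

```python
from math import sqrt

# Hypothetical site -> topic weights (illustrative values, not from the paper).
# Each entry is an edge list in a bipartite site/topic graph.
site_topics = {
    "blog_a": {"politics": 0.7, "economy": 0.3},
    "blog_b": {"politics": 0.6, "economy": 0.4},
    "blog_c": {"cooking": 0.9, "travel": 0.1},
}


def cosine(u, v):
    """Cosine similarity between two sparse topic-weight vectors."""
    dot = sum(w * v.get(topic, 0.0) for topic, w in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(site, graph):
    """Return (other_site, score) for the site closest to `site` in topic space."""
    scores = [
        (other, cosine(graph[site], graph[other]))
        for other in graph
        if other != site
    ]
    return max(scores, key=lambda pair: pair[1])


if __name__ == "__main__":
    print(most_similar("blog_a", site_topics))
```

Under these assumptions, a high score would suggest a similar page, while a low score between sites that share a concept could flag pages with a different focus; detecting opposing views would likely need more than topic overlap.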
