Eddi: interactive topic-based browsing of social status streams

Twitter streams are on overload: active users receive hundreds of items per day, and existing interfaces force us to march through a chronologically ordered morass to find tweets of interest. We present an approach to organizing a user's own feed into coherently clustered trending topics for more directed exploration. Our Twitter client, called Eddi, groups the tweets in a user's feed into topics that are mentioned explicitly or implicitly, and users can then browse those topics for items of interest. To implement this topic clustering, we have developed a novel algorithm for discovering topics in short status updates, powered by syntactic transformation of the text and callouts to a search engine. An evaluation of the algorithm reveals that search engine callouts outperform other approaches when they employ simple syntactic transformation and backoff strategies. Active Twitter users evaluated Eddi and found it to be a more efficient and enjoyable way to browse an overwhelming status update feed than the standard chronological interface.
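The general shape of a search-engine callout with backoff can be illustrated with a minimal sketch. This is not the Eddi implementation: the abstract does not specify the transformation rules, the backoff order, or the search engine, so everything below is an assumption made for illustration. In particular, `simplify` is a crude stand-in for syntactic transformation (it only strips URLs, mentions, and a few stopwords), the backoff simply drops trailing words, and `search` is a hypothetical callable that the caller supplies and that returns result snippets for a query.

```python
import re
from collections import Counter
from typing import Callable, Iterator, List

# Tiny illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"a", "an", "and", "in", "is", "of", "on", "the", "to", "was"}


def simplify(text: str) -> List[str]:
    """Crude stand-in for syntactic transformation.

    Lowercases the text, removes URLs and @mentions, and drops stopwords.
    """
    cleaned = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    tokens = re.findall(r"[a-z0-9']+", cleaned)
    return [t for t in tokens if t not in STOPWORDS]


def backoff_queries(tokens: List[str]) -> Iterator[str]:
    """Yield the full query first, then shorter ones (last word dropped)."""
    for n in range(len(tokens), 0, -1):
        yield " ".join(tokens[:n])


def discover_topics(
    tweet: str,
    search: Callable[[str], List[str]],  # hypothetical search callout
    min_results: int = 2,
    top_k: int = 3,
) -> List[str]:
    """Return the terms that appear in the most search results.

    Queries are tried from most to least specific; the first query that
    returns at least `min_results` snippets is used.
    """
    for query in backoff_queries(simplify(tweet)):
        snippets = search(query)
        if len(snippets) >= min_results:
            counts: Counter = Counter()
            for snippet in snippets:
                # Count each term once per snippet (document frequency).
                counts.update(set(simplify(snippet)))
            # Sort by frequency, then alphabetically, for a stable order.
            ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
            return [term for term, _ in ranked[:top_k]]
    return []
```

To try the sketch, pass any function that maps a query string to a list of text snippets, for example a wrapper around a local index or a canned dictionary of results. Terms that recur across the returned snippets but never appear in the tweet itself correspond roughly to the implicitly mentioned topics described above.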
