Searching Twitter: Separating the Tweet from the Chaff

Among the millions of messages posted to online social networks, there is undoubtedly some valuable and useful information. Although a large portion of social media content is considered to be babble, research shows that people share useful links, provide recommendations to friends, answer questions, and solve problems. In this paper, we report on a qualitative investigation into the factors that make tweets ‘useful’ and ‘not useful’ for a set of common search tasks. The investigation identified 16 features that help make a tweet useful; useful tweets often exhibited two or three of these features. ‘Not useful’ tweets, however, typically had only one of 17 clear and striking features. Further, we found that these features can be weighted according to the type of search task. Our results contribute a novel framework for extracting useful information from real-time streams of social media content, which will be used in the design of a future retrieval system.
