Information search and retrieval in microblogs

Modern information retrieval (IR) has come to terms with numerous new media in efforts to help people find information in increasingly diverse settings. Among these new media are so-called microblogs. A microblog is a stream of text that is written by an author over time. It comprises many very brief updates that are presented to the microblog's readers in reverse-chronological order. Today, the service called Twitter is the most popular microblogging platform. Although microblogging is increasingly popular, methods for organizing and providing access to microblog data are still new. This review offers an introduction to the problems that face researchers and developers of IR systems in microblog settings. After an overview of microblogs and the behavior surrounding them, the review describes established problems in microblog retrieval, such as entity search and sentiment analysis, and modeling abstractions, such as authority and quality. The review also treats user-created metadata that often appear in microblogs. Because the problem of microblog search is so new, the review concludes with a discussion of particularly pressing research issues yet to be studied in the field. © 2011 Wiley Periodicals, Inc.
