Creating Stories from Socially Curated Microblog Messages

SUMMARY Social media such as microblogs have become so pervasive that it is now possible to use them as sensors for real-world events and memes. While much recent research has focused on developing automatic methods for filtering and summarizing these data streams, we explore a different trend called social curation. In contrast to automatic methods, social curation is a human-in-the-loop, and sometimes crowd-sourced, mechanism for exploiting social media as sensors. Although social curation web services like Togetter, Naver Matome and Storify are gaining popularity, little academic research has studied the phenomenon. In this paper, our goal is to investigate the phenomenon and potential of this new field of social curation. First, we perform an in-depth analysis of a large corpus of curated microblog data, seeking to understand why and how people participate in this laborious curation process. We then explore new ways in which information retrieval and machine learning technologies can be used to assist curators. In particular, we propose a novel method based on a learning-to-rank framework that increases the curator’s productivity and breadth of perspective by suggesting which novel microblogs should be added to the curated content.
