Online And Social Media Data As A Flawed Continuous Panel Survey

There is a large body of research on using online activity to predict real-world outcomes, ranging from outbreaks of influenza to the results of elections. There is considerably less work, however, on using these data to understand topic-specific interest and opinion among the general population and specific demographic subgroups, as currently measured by relatively expensive surveys. Here we investigate this possibility by studying a full census of all Twitter activity during the 2012 election cycle, along with the comprehensive search history of a large panel of internet users during the same period, and we highlight the challenges in interpreting online and social media activity as the results of a survey. As noted in existing work, the online population is a non-representative sample of the offline world (e.g., the U.S. voting population). We extend this work to show that demographic skew and user participation are non-stationary and unpredictable over time. In addition, the nature of user contributions varies wildly around important events. Finally, we note subtle problems in mapping what people share or consume online to specific sentiment or opinion measures around a particular topic. These issues must be addressed before meaningful insight about public interest and opinion can be reliably extracted from online and social media data.

This draft: May 15, 2014. Latest draft: http://research.microsoft.com/en-US/projects/flawedsurvey
