Assessing the Bias in Communication Networks Sampled from Twitter

We collect and analyse messages exchanged on Twitter using two of the platform's publicly available APIs (the search and stream specifications). We assess the differences between the two resulting samples and compare the communication networks reconstructed from them. The empirical context is a series of political protests that took place in May 2012: we track online communication around these protests for one month and reconstruct the networks of mentions and retweets from each sample. We find that the search API over-represents the more central users and does not offer an accurate picture of peripheral activity; we also find that the bias is greater for the network of mentions than for the network of retweets. We discuss the implications of this bias for the study of diffusion dynamics and collective action in the digital era, and advocate more uniform sampling procedures in the study of online communication.
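
The comparison described above can be illustrated with a minimal sketch. It is not the procedure used in the study: the record layout (`user`, `text`), the function names, the in-degree threshold, and the choice of the stream sample as the reference are all assumptions made for illustration. The sketch builds a directed mention network from each sample and reports what share of central and peripheral users in the reference network also appear in the other sample.

```python
import re
from collections import Counter

# Hypothetical record layout: each tweet is a dict with "user" and "text".
MENTION = re.compile(r"@(\w+)")


def mention_edges(tweets):
    """Count directed (author, mentioned_user) edges found in the tweet text.

    Note: text-style retweets ("RT @user ...") are also matched here; a real
    analysis would separate them using retweet metadata.
    """
    edges = Counter()
    for tweet in tweets:
        author = tweet["user"].lower()
        for target in MENTION.findall(tweet["text"]):
            target = target.lower()
            if target != author:  # ignore self-mentions
                edges[(author, target)] += 1
    return edges


def in_degree(edges):
    """Number of distinct users mentioning each user."""
    degree = Counter()
    for _, target in edges:
        degree[target] += 1
    return degree


def nodes(edges):
    """All users appearing as source or target of an edge."""
    return {user for edge in edges for user in edge}


def coverage_by_centrality(reference_edges, sample_edges, threshold):
    """Share of reference users also present in the sample.

    Returns (central_share, peripheral_share), where "central" means an
    in-degree of at least `threshold` in the reference network (an
    illustrative cut-off, not a measure taken from the paper).
    """
    degree = in_degree(reference_edges)
    ref_nodes = nodes(reference_edges)
    seen = nodes(sample_edges)
    central = {u for u in ref_nodes if degree[u] >= threshold}
    peripheral = ref_nodes - central

    def share(group):
        return len(group & seen) / len(group) if group else 0.0

    return share(central), share(peripheral)


# Toy data, invented for illustration only.
stream_sample = [
    {"user": "a", "text": "@hub join us"},
    {"user": "b", "text": "@hub agreed"},
    {"user": "c", "text": "@hub see you there"},
    {"user": "d", "text": "@e are you coming?"},
]
search_sample = [
    {"user": "a", "text": "@hub join us"},
    {"user": "b", "text": "@hub agreed"},
]

central, peripheral = coverage_by_centrality(
    mention_edges(stream_sample), mention_edges(search_sample), threshold=2
)
print(f"central coverage: {central:.2f}, peripheral coverage: {peripheral:.2f}")
```

In this toy case the single central user is fully covered by the second sample while most peripheral users are missing, which is the kind of pattern the abstract attributes to the search API. In-degree is used only because it is simple to compute; any other centrality measure could be substituted.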
