Using APIs for Data Collection on Social Media

This article discusses how social media research may benefit when social media companies make data available to researchers through their application programming interfaces (APIs). An API is a back-end interface through which third-party developers can connect add-ons to an existing service. It also serves as an interface through which researchers can collect data from a given social media service for empirical analysis. Presenting a critical methodological discussion of the opportunities and challenges of API-based quantitative and qualitative social media research, the article highlights a number of general methodological issues that must be addressed when collecting and assessing data through APIs. It further discusses the legal and ethical implications of using APIs for data collection in empirical research.
