Toward Reproducibility in Online Social Network Research

The challenge of conducting reproducible computational research is acknowledged across myriad disciplines, from biology to computer science. In the latter, research leveraging online social networks (OSNs) must deal with a set of complex issues, such as ensuring that data can be collected in an appropriate and reproducible manner. Making research reproducible is difficult, and researchers may need suitable incentives, as well as tools and systems, to do so. In this paper, we explore the state of the art in OSN research reproducibility and present an architecture to aid reproducibility. We characterize reproducible OSN research using three main themes: 1) reporting of methods; 2) availability of code; and 3) sharing of research data. We survey 505 papers and assess the extent to which they achieve these reproducibility objectives. While systems-oriented papers are more likely to explain data-handling aspects of their methodology, social science papers are better at describing their participant-handling procedures. We then examine incentives to make research reproducible by conducting a citation analysis of these papers. We find that sharing data is associated with increased citation count, while sharing methods and code does not appear to be. Finally, we introduce an architecture that supports the conduct of reproducible OSN research, and we evaluate it by replicating an existing research study.
