Ethical research standards in a world of big data

In 2009, Ginsberg et al. reported using Google search query volume to estimate influenza activity earlier than traditional methods. It was a groundbreaking example of digital disease detection, and it remains a clear illustration of the power of internet-derived data for important research. In recent years, these methods have been extended to new topics and data sources; Twitter in particular has been used for surveillance of influenza-like illnesses, political sentiment, and even behavioral risk factors such as sentiment about childhood vaccination programs. As the research landscape continues to change, the protection of human subjects in online research needs to keep pace. Here we propose a number of guidelines to ensure that the work of digital researchers rests on ethical-use principles. Our proposed guidelines are: 1) Study designs using Twitter-derived data should be transparent and readily available to the public. 2) The context in which a tweet is sent should be respected by researchers. 3) All data that could be used to identify tweet authors, including geolocations, should be secured. 4) No information collected from Twitter should be used to procure more data about tweet authors from other sources. 5) Study designs that require data collection from a few individuals, rather than aggregate analysis, require Institutional Review Board (IRB) approval. 6) Researchers should honor a user’s attempts to control his or her data by respecting privacy settings. As researchers, we believe that a discourse within the research community is needed to ensure the protection of research subjects. These guidelines are offered to help start this discourse and to lay the foundations for the ethical use of Twitter data.
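As an illustration only, guidelines 3 and 6 could be approximated in a preprocessing step like the sketch below. It is not the authors' method. The record fields (`user_id`, `text`, `lat`, `lon`, `protected`) and the function names are hypothetical and do not correspond to any particular Twitter API response. The sketch replaces author IDs with a keyed hash, coarsens coordinates, and skips records from accounts marked as protected. Tweet text is passed through unchanged and can itself be identifying, so this step alone would not be sufficient to secure the data.

```python
import hashlib
import hmac


def pseudonymize(user_id: str, secret_key: bytes) -> str:
    """Return a keyed SHA-256 hash of the user ID (hex string).

    The same ID and key always give the same pseudonym, so records from one
    author can still be grouped without storing the original ID.
    """
    return hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def coarsen(coordinate: float, decimals: int = 1) -> float:
    """Round a latitude or longitude to reduce its spatial precision."""
    return round(coordinate, decimals)


def prepare_for_analysis(records, secret_key: bytes, decimals: int = 1):
    """Build an analysis copy of hypothetical tweet records.

    Assumed (hypothetical) input fields: user_id, text, lat, lon, protected.
    """
    cleaned = []
    for rec in records:
        # Guideline 6: respect privacy settings by skipping protected accounts.
        if rec.get("protected"):
            continue
        out = {
            # Guideline 3: do not carry the raw author ID into the analysis set.
            "author": pseudonymize(rec["user_id"], secret_key),
            "text": rec["text"],
        }
        # Guideline 3: keep only coarse geolocation, when any is present.
        if rec.get("lat") is not None and rec.get("lon") is not None:
            out["lat"] = coarsen(rec["lat"], decimals)
            out["lon"] = coarsen(rec["lon"], decimals)
        cleaned.append(out)
    return cleaned
```

The secret key would need to be stored separately from the analysis data; anyone holding both could recompute the mapping from IDs to pseudonyms.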

[1] B. Edgell. The British Psychological Society, 1947, The British journal of psychology. General section.

[2] G. Marx. Murky conceptual waters: The public and the private, 2001, Ethics and Information Technology.

[3] Elizabeth H. Bassett, et al. Ethics of Internet research: Contesting the human subjects research model, 2002, Ethics and Information Technology.

[4] H. Nissenbaum. Privacy as contextual integrity, 2004.

[5] Cynthia Dwork, et al. Differential Privacy, 2006, ICALP.

[6] D. Christakis, et al. Research Ethics in the MySpace Era, 2008, Pediatrics.

[7] Jeremy Ginsberg, et al. Detecting influenza epidemics using search engine query data, 2009, Nature.

[8] Charles Ess, et al. Internet Research Ethics: The Field and Its Critical Issues, 2009.

[9] Joseph Bonneau, et al. The Privacy Jungle: On the Market for Data Protection in Social Networks, 2009, WEIS.

[10] Vitaly Shmatikov, et al. De-anonymizing Social Networks, 2009, 2009 30th IEEE Symposium on Security and Privacy.

[11] Rebecca Eynon, et al. New techniques in online research: challenges for research ethics, 2009.

[12] Nello Cristianini, et al. Tracking the flu pandemic by monitoring the social web, 2010, 2010 2nd International Workshop on Cognitive Information Processing.

[13] Lauren B Solberg. Data Mining on Facebook: A Free Space for Researchers or an IRB Nightmare?, 2010.

[14] Isabell M. Welpe, et al. Predicting Elections with Twitter: What 140 Characters Reveal about Political Sentiment, 2010, ICWSM.

[15] M. Zimmer. “But the data is already public”: on the ethics of research in Facebook, 2010, Ethics and Information Technology.

[16] Celine Latulipe, et al. Contextual gaps: privacy issues on Facebook, 2009, Ethics and Information Technology.

[17] Alexander Halavais. Social science: Open up online research, 2011, Nature.

[18] M. Thelwall, et al. Researching Personal Information on the Public Web, 2011.

[19] D. Dittrich, et al. Computer Science Security Research and Human Subjects: Emerging Considerations for Research Ethics Boards, 2011, Journal of Empirical Research on Human Research Ethics.

[20] Benyuan Liu, et al. Predicting Flu Trends using Twitter data, 2011, 2011 IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS).

[21] W. Douglas Maughan, et al. The Menlo Report, 2012, IEEE Security & Privacy.

[22] Fabian Neuhaus, et al. AGILE ETHICS FOR MASSIFIED RESEARCH AND VISUALIZATION, 2012.

[23] D. Boyd, et al. CRITICAL QUESTIONS FOR BIG DATA, 2012.

[24] B. Huberman. Sociology of science: Big data deserve a bigger audience, 2012, Nature.

[25] Kate Thomson. Ethics of Twitter Research, 2012.

[26] D. Dittrich, et al. The Menlo Report: Ethical Principles Guiding Information and Communication Technology Research, 2012.

[27] Tristan Henderson, et al. Understanding ethical concerns in social media privacy studies, 2013.

[28] John P A Ioannidis, et al. Informed Consent, Big Data, and the Oxymoron of Research That Is Not Research, 2013, The American journal of bioethics : AJOB.

[29] Tristan Henderson, et al. An architecture for ethical and privacy-sensitive social network experiments, 2013, PERV.

[30] Marcel Salathé, et al. Complex social contagion makes networks more vulnerable to disease outbreaks, 2012, Scientific Reports.

[31] R. Mckee. Ethical issues in using social media for health and health care research, 2013, Health policy.

[32] Yunan Chen, et al. Using contextual integrity to examine interpersonal information boundary on social network sites, 2013, CHI.

[33] A. Anonymous, et al. Consumer Data Privacy in a Networked World: A Framework for Protecting Privacy and Promoting Innovation in the Global Digital Economy, 2013, J. Priv. Confidentiality.

[34] P. Ossorio, et al. Regulation of Online Social Network Studies, 2013, Science.

[35] J. Burgess, et al. The Politics of Twitter Data, 2013.

[36] Ethical Issues in Internet Research: International Good Practice and Irish Research Ethics Documents, 2013.

[37] T. Buchanan, et al. Ethics Guidelines for Internet-mediated Research, 2013.

[38] D. Zucker. The Belmont Report, 2014.

[39] B. Lewis, et al. Methods of using real-time social media technologies for detection and remote monitoring of HIV outcomes, 2014, Preventive medicine.

[40] A. Schuchat. DEPARTMENT OF HEALTH & HUMAN SERVICES, 2015.