Beyond the EULA: Improving consent for data mining

Companies and academic researchers may collect, process, and distribute large quantities of personal data without the explicit knowledge or consent of the individuals to whom the data pertain. Existing consent forms are often not readable enough, and ethical oversight of data mining may be insufficient. This raises the question of whether existing consent instruments are sufficient, logistically feasible, or even necessary for data mining. In this chapter, we review the data collection and mining landscape, including commercial and academic activities, and the relevant data protection concerns, to determine the types of consent instruments used. Drawing on three case studies, we apply the new paradigm of human-data interaction to examine whether these existing approaches are appropriate. We then introduce an approach to consent that has been empirically demonstrated to improve on the state of the art and deliver meaningful consent. Finally, we propose some best practices for data collectors to ensure their data mining activities do not violate the expectations of the people to whom the data relate.
