Up-to-the-Minute Privacy Policies via Gossips in Participatory Epidemiological Studies

In participatory epidemiology, both researchers and the researched population are actively involved in the study. Such studies collect many details about each individual. Recent developments in statistical inference can cause sensitive information to leak from seemingly insensitive data about individuals. Typical safeguarding mechanisms are vetted by ethics committees; however, attack models are constantly evolving. Newly discovered threats, changes in applicable laws, or a change in an individual's perception can raise concerns that affect the study. Addressing these concerns is imperative to maintain trust with the researched population. We are implementing Lohpi, an infrastructure for building accountability in data processing for participatory epidemiology. We address the challenge of data ownership by allowing institutions to host data on servers they manage while remaining part of Lohpi. We update data access policies using gossip. We present Lohpi as a novel architecture for research data processing and evaluate its dissemination, overhead, and fault tolerance.
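To make the idea of gossip-based policy updates concrete, the following is a minimal sketch of push-style gossip over versioned access policies. It is not Lohpi's actual protocol: the `Node` class, the integer version numbers, the `fanout` parameter, and the dataset name `study-A` are all assumptions introduced purely for illustration.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical data-hosting node holding versioned access policies."""
    name: str
    policies: dict = field(default_factory=dict)  # dataset id -> (version, rule)

    def merge(self, incoming: dict) -> None:
        # Keep whichever policy carries the higher version number.
        for dataset, (version, rule) in incoming.items():
            current = self.policies.get(dataset)
            if current is None or version > current[0]:
                self.policies[dataset] = (version, rule)


def gossip_round(nodes, fanout, rng):
    """Each node pushes its policy table to `fanout` randomly chosen peers."""
    for sender in nodes:
        peers = [n for n in nodes if n is not sender]
        snapshot = dict(sender.policies)
        for peer in rng.sample(peers, min(fanout, len(peers))):
            peer.merge(snapshot)


def disseminate(nodes, dataset, fanout=2, seed=0, max_rounds=100):
    """Gossip until all nodes hold the newest version; return the round count."""
    rng = random.Random(seed)
    newest = max(n.policies.get(dataset, (0, None))[0] for n in nodes)
    for round_no in range(1, max_rounds + 1):
        gossip_round(nodes, fanout, rng)
        if all(n.policies.get(dataset, (0, None))[0] == newest for n in nodes):
            return round_no
    return None  # did not converge within max_rounds


# Usage: one node learns of a stricter policy and the update spreads.
nodes = [Node(f"node{i}") for i in range(16)]
for n in nodes:
    n.policies["study-A"] = (1, "allow: researchers")
nodes[0].policies["study-A"] = (2, "deny: all")
rounds = disseminate(nodes, "study-A")
print("rounds until all nodes hold the new policy:", rounds)
```

In this sketch a larger `fanout` spreads an update in fewer rounds but sends more messages per round, which is the kind of dissemination-versus-overhead trade-off an evaluation of such a system would measure.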
