Systematizing Genomic Privacy Research - A Critical Analysis

Rapid advances in human genomics are enabling life science researchers to gain a better understanding of the role of genetic variation in our ancestry, health, and well-being, which stimulates hope for more cost-efficient and effective healthcare. However, this progress also raises a number of security and privacy concerns, stemming from the distinctive characteristics of genomic data. To address these concerns, a new research community has emerged, producing a large number of publications and initiatives. In this paper, we introduce and execute a structured methodology to systematize the current knowledge around genome privacy research, focusing on privacy-enhancing technologies used in the context of testing, storing, and sharing genomic data, and selecting a representative sample of the community's work. Using carefully crafted systematization criteria, we provide and discuss critical viewpoints and a comprehensive analysis of the timeliness and relevance of the work produced by the community. In doing so, we highlight that proposed technologies can only offer protection in the short term, scrutinize assumptions made by the community, and analyze the costs introduced by privacy defenses in terms of various types of utility and flexibility overhead.
