My Genome Belongs to Me: Controlling Third Party Computation on Genomic Data

Abstract An individual’s genetic information is possibly the most valuable personal information. While knowledge of a person’s DNA sequence can facilitate the diagnosis of several heritable diseases and allow personalized treatment, its exposure comes with significant threats to the patient’s privacy. Currently known solutions for privacy-respecting computation require the owner of the DNA to either be heavily involved in the execution of a cryptographic protocol or to completely outsource the access control to a third party. This motivates the demand for cryptographic protocols which enable computation over encrypted genomic data while keeping the owner of the genome in full control. We envision a scenario where data owners can exercise arbitrary and dynamic access policies, depending on the intended use of the analysis results and on the credentials of who is conducting the analysis. At the same time, data owners are not required to maintain a local copy of their entire genetic data and do not need to exhaust their computational resources in an expensive cryptographic protocol. In this work, we present METIS, a system that assists the computation over encrypted data stored in the cloud while leaving the decision on admissible computations to the data owner. It is based on garbled circuits and supports any polynomially-computable function. A critical feature of our system is that the data owner is free from computational overload and her communication complexity is independent of the size of the input data and only linear in the size of the circuit’s output. We demonstrate the practicality of our approach with an implementation and an evaluation of several functions over real datasets.

[1]  Emiliano De Cristofaro,et al.  Countering GATTACA: efficient and secure testing of fully-sequenced human genomes , 2011, CCS '11.

[2]  Yan Huang,et al.  Efficient Genome-Wide, Privacy-Preserving Similar Patient Query based on Private Edit Distance , 2015, CCS.

[3]  Ran Canetti,et al.  The random oracle methodology, revisited , 2000, JACM.

[4]  Silvio Micali,et al.  How to construct random functions , 1986, JACM.

[5]  Cynthia Dwork,et al.  Differential Privacy: A Survey of Results , 2008, TAMC.

[6]  Moni Naor,et al.  Privacy preserving auctions and mechanism design , 1999, EC '99.

[7]  Jean-Pierre Hubaux,et al.  Privacy-Preserving Computation of Disease Risk by Using Genomic, Clinical, and Environmental Data , 2013, HealthTech.

[8]  Boris Pasche,et al.  Whole-genome sequencing: a step closer to personalized medicine. , 2011, JAMA.

[9]  Jonathan Katz,et al.  Efficient Secure Two-Party Computation Using Symmetric Cut-and-Choose , 2013, CRYPTO.

[10]  Brent Waters,et al.  A Framework for Efficient and Composable Oblivious Transfer , 2008, CRYPTO.

[11]  Laurie D. Smith,et al.  A 26-hour system of highly sensitive whole genome sequencing for emergency management of genetic diseases , 2015, Genome Medicine.

[12]  Stefan Katzenbeisser,et al.  Privacy-Preserving Whole Genome Sequence Processing through Proxy-Aided ORAM , 2014, WPES.

[13]  Patrick Traynor,et al.  Whitewash: outsourcing garbled circuit generation for mobile devices , 2014, ACSAC.

[14]  Craig Gentry,et al.  i-Hop Homomorphic Encryption and Rerandomizable Yao Circuits , 2010, IACR Cryptol. ePrint Arch..

[15]  Yuval Ishai,et al.  Extending Oblivious Transfers Efficiently , 2003, CRYPTO.

[16]  Kim Laine,et al.  Secure Data Exchange: A Marketplace in the Cloud , 2016, IACR Cryptol. ePrint Arch..

[17]  Abhi Shelat,et al.  Towards Billion-Gate Secure Computation with Malicious Adversaries , 2012, IACR Cryptol. ePrint Arch..

[18]  David Evans,et al.  Obliv-C: A Language for Extensible Data-Oblivious Computation , 2015, IACR Cryptol. ePrint Arch..

[19]  Mariana Raykova,et al.  Outsourcing Multi-Party Computation , 2011, IACR Cryptol. ePrint Arch..

[20]  Jonathan Katz,et al.  Faster Secure Two-Party Computation Using Garbled Circuits , 2011, USENIX Security Symposium.

[21]  Yehuda Lindell Fast Cut-and-Choose Based Protocols for Malicious and Covert Adversaries , 2013, CRYPTO.

[22]  Jonathan Katz,et al.  On the Security of the Free-XOR Technique , 2012, IACR Cryptol. ePrint Arch..

[23]  Vladimir Kolesnikov,et al.  Improved Garbled Circuit: Free XOR Gates and Applications , 2008, ICALP.

[24]  Yehuda Lindell,et al.  Privacy Preserving Data Mining , 2000, Journal of Cryptology.

[25]  L. Staudt,et al.  The NCI Genomic Data Commons as an engine for precision medicine. , 2017, Blood.

[26]  Mihir Bellare,et al.  Random oracles are practical: a paradigm for designing efficient protocols , 1993, CCS '93.

[27]  L. Jorde,et al.  Genetic variation, classification and 'race' , 2004, Nature Genetics.

[28]  Vladimir Kolesnikov,et al.  FleXOR: Flexible garbling for XOR gates that beats free-XOR , 2014, IACR Cryptol. ePrint Arch..

[29]  Murat Kantarcioglu,et al.  A Cryptographic Approach to Securely Share and Query Genomic Sequences , 2008, IEEE Transactions on Information Technology in Biomedicine.

[30]  Larry Carter,et al.  Universal Classes of Hash Functions , 1979, J. Comput. Syst. Sci..

[31]  Gonçalo R. Abecasis,et al.  The variant call format and VCFtools , 2011, Bioinform..

[32]  Moni Naor,et al.  Oblivious transfer and polynomial evaluation , 1999, STOC '99.

[33]  Mihir Bellare,et al.  Adaptively Secure Garbling with Applications to One-Time Programs and Secure Outsourcing , 2012, ASIACRYPT.

[34]  Ben Riva,et al.  Salus: a system for server-aided secure function evaluation , 2012, CCS.

[35]  Carl A. Gunter,et al.  Controlled Functional Encryption , 2014, CCS.

[36]  Kristin E. Lauter,et al.  Private genome analysis through homomorphic encryption , 2015, BMC Medical Informatics and Decision Making.

[37]  Benny Pinkas,et al.  Fairplay - Secure Two-Party Computation System , 2004, USENIX Security Symposium.

[38]  Kelly Edwards,et al.  From patients to partners: participant-centric initiatives in biomedical research , 2012, Nature Reviews Genetics.

[39]  Jean-Pierre Hubaux,et al.  Privacy-Enhancing Technologies for Medical Tests Using Genomic Data , 2013, NDSS.

[40]  Peter M. Rice,et al.  The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants , 2009, Nucleic acids research.

[41]  Yehuda Lindell,et al.  More efficient oblivious transfer and extensions for faster secure computation , 2013, CCS.

[42]  Patrick Traynor,et al.  Secure outsourced garbled circuit evaluation for mobile devices , 2013, J. Comput. Secur..

[43]  Silvio Micali,et al.  The round complexity of secure protocols , 1990, STOC '90.

[44]  Mihir Bellare,et al.  Foundations of garbled circuits , 2012, CCS.

[45]  Vitaly Shmatikov,et al.  Towards Practical Privacy for Genomic Computation , 2008, 2008 IEEE Symposium on Security and Privacy (sp 2008).

[46]  L. Brooks,et al.  A DNA polymorphism discovery resource for research on human genetic variation. , 1998, Genome research.

[47]  Claudio Orlandi,et al.  A Framework for Outsourcing of Secure Computation , 2014, CCSW.

[48]  Keith B. Frikken Practical Private DNA String Searching and Matching through Efficient Oblivious Automata Evaluation , 2009, DBSec.

[49]  Ronald Cramer,et al.  Universal Hash Proofs and a Paradigm for Adaptive Chosen Ciphertext Secure Public-Key Encryption , 2001, EUROCRYPT.

[50]  Joan Feigenbaum,et al.  Reuse It Or Lose It: More Efficient Secure Computation Through Reuse of Encrypted Values , 2014, CCS.

[51]  M. Kalia,et al.  Personalized oncology: recent advances and future challenges. , 2013, Metabolism: clinical and experimental.

[52]  M. Watson,et al.  Illuminating the future of DNA sequencing , 2014, Genome Biology.

[53]  Abhi Shelat,et al.  PCF: A Portable Circuit Format for Scalable Two-Party Secure Computation , 2013, USENIX Security Symposium.

[54]  Andrew Chi-Chih Yao,et al.  How to generate and exchange secrets , 1986, 27th Annual Symposium on Foundations of Computer Science (sfcs 1986).

[55]  Joan Scott,et al.  Public opinion about the importance of privacy in biobank research. , 2009, American journal of human genetics.

[56]  Moni Naor,et al.  Number-theoretic constructions of efficient pseudo-random functions , 2004, JACM.

[57]  Jonathan Katz,et al.  Quid-Pro-Quo-tocols: Strengthening Semi-honest Protocols with Dual Execution , 2012, 2012 IEEE Symposium on Security and Privacy.

[58]  Stefan Katzenbeisser,et al.  Privacy preserving error resilient dna searching through oblivious automata , 2007, CCS '07.

[59]  Yehuda Lindell,et al.  A Proof of Security of Yao’s Protocol for Two-Party Computation , 2009, Journal of Cryptology.

[60]  Emiliano De Cristofaro,et al.  Whole Genome Sequencing: Revolutionary Medicine or Privacy Nightmare? , 2015, Computer.