Private genome analysis through homomorphic encryption

BackgroundThe rapid development of genome sequencing technology allows researchers to access large genome datasets. However, outsourcing the data processing o the cloud poses high risks for personal privacy. The aim of this paper is to give a practical solution for this problem using homomorphic encryption. In our approach, all the computations can be performed in an untrusted cloud without requiring the decryption key or any interaction with the data owner, which preserves the privacy of genome data.MethodsWe present evaluation algorithms for secure computation of the minor allele frequencies and χ2 statistic in a genome-wide association studies setting. We also describe how to privately compute the Hamming distance and approximate Edit distance between encrypted DNA sequences. Finally, we compare performance details of using two practical homomorphic encryption schemes - the BGV scheme by Gentry, Halevi and Smart and the YASHE scheme by Bos, Lauter, Loftus and Naehrig.ResultsThe approach with the YASHE scheme analyzes data from 400 people within about 2 seconds and picks a variant associated with disease from 311 spots. For another task, using the BGV scheme, it took about 65 seconds to securely compute the approximate Edit distance for DNA sequences of size 5K and figure out the differences between them.ConclusionsThe performance numbers for BGV are better than YASHE when homomorphically evaluating deep circuits (like the Hamming distance algorithm or approximate Edit distance algorithm). On the other hand, it is more efficient to use the YASHE scheme for a low-degree computation, such as minor allele frequencies or χ2 test statistic in a case-control study.

[1]  H. Nussbaumer,et al.  Fast polynomial transform algorithms for digital convolution , 1980 .

[2]  Michael Naehrig,et al.  Improved Security for a Ring-Based Fully Homomorphic Encryption Scheme , 2013, IMACC.

[3]  Murat Kantarcioglu,et al.  A Cryptographic Approach to Securely Share and Query Genomic Sequences , 2008, IEEE Transactions on Information Technology in Biomedicine.

[4]  Michael Naehrig,et al.  ML Confidential: Machine Learning on Encrypted Data , 2012, ICISC.

[5]  Brent Waters,et al.  Homomorphic Encryption from Learning with Errors: Conceptually-Simpler, Asymptotically-Faster, Attribute-Based , 2013, CRYPTO.

[6]  Jean-Pierre Hubaux,et al.  Addressing the concerns of the lacks family: quantification of kin genomic privacy , 2013, CCS.

[7]  Chris Peikert,et al.  On Ideal Lattices and Learning with Errors over Rings , 2010, JACM.

[8]  Michael Naehrig,et al.  A Comparison of the Homomorphic Encryption Schemes FV and YASHE , 2014, AFRICACRYPT.

[9]  Vinod Vaikuntanathan,et al.  On-the-fly multiparty computation on the cloud via multikey fully homomorphic encryption , 2012, STOC '12.

[10]  Carl A. Gunter,et al.  Privacy in the Genomic Era , 2014, ACM Comput. Surv..

[11]  Chris Peikert,et al.  Better Key Sizes (and Attacks) for LWE-Based Encryption , 2011, CT-RSA.

[12]  Craig Gentry,et al.  Fully Homomorphic Encryption over the Integers , 2010, EUROCRYPT.

[13]  Emiliano De Cristofaro,et al.  Countering GATTACA: efficient and secure testing of fully-sequenced human genomes , 2011, CCS '11.

[14]  Jung Hee Cheon,et al.  Search-and-compute on Encrypted Data , 2015, IACR Cryptol. ePrint Arch..

[15]  Yaniv Erlich,et al.  Routes for breaching and protecting genetic privacy , 2013, Nature Reviews Genetics.

[16]  Jean-Pierre Hubaux,et al.  Privacy-Preserving Computation of Disease Risk by Using Genomic, Clinical, and Environmental Data , 2013, HealthTech.

[17]  S. Halevi,et al.  Design and Implementation of a Homomorphic-Encryption Library , 2012 .

[18]  Michael J. Fischer,et al.  The String-to-String Correction Problem , 1974, JACM.

[19]  Michael Naehrig,et al.  Private Computation on Encrypted Genomic Data , 2014, LATINCRYPT.

[20]  Ron Steinfeld,et al.  Making NTRU as Secure as Worst-Case Problems over Ideal Lattices , 2011, EUROCRYPT.

[21]  Zvika Brakerski,et al.  Fully Homomorphic Encryption without Modulus Switching from Classical GapSVP , 2012, CRYPTO.

[22]  L. Bluestein A linear filtering approach to the computation of discrete Fourier transform , 1970 .

[23]  Michael Naehrig,et al.  Private Predictive Analysis on Encrypted Medical Data , 2014, IACR Cryptol. ePrint Arch..

[24]  Frederik Vercauteren,et al.  Fully homomorphic SIMD operations , 2012, Designs, Codes and Cryptography.

[25]  Jung Hee Cheon,et al.  Homomorphic Computation of Edit Distance , 2015, IACR Cryptol. ePrint Arch..

[26]  Vinod Vaikuntanathan,et al.  Can homomorphic encryption be practical? , 2011, CCSW '11.

[27]  G. Church,et al.  The Personal Genome Project , 2005, Molecular systems biology.

[28]  Takeshi Koshiba,et al.  Secure pattern matching using somewhat homomorphic encryption , 2013, CCSW.

[29]  Craig Gentry,et al.  Fully homomorphic encryption using ideal lattices , 2009, STOC '09.

[30]  Craig Gentry,et al.  Homomorphic Evaluation of the AES Circuit , 2012, IACR Cryptol. ePrint Arch..

[31]  Craig Gentry,et al.  (Leveled) fully homomorphic encryption without bootstrapping , 2012, ITCS '12.