BLOOM: BLoom filter based oblivious outsourced matchings

Background: Whole genome sequencing has become fast, accurate, and cheap, paving the way towards the large-scale collection and processing of human genome data. Unfortunately, this dawning genome era does not only promise tremendous advances in biomedical research but also causes unprecedented privacy risks for the many. Handling storage and processing of large genome datasets through cloud services greatly aggravates these concerns. Current research efforts thus investigate the use of strong cryptographic methods and protocols to implement privacy-preserving genomic computations.

Methods: We propose Fhe-Bloom and Phe-Bloom, two efficient approaches for genetic disease testing using homomorphically encrypted Bloom filters. Both approaches allow the data owner to securely outsource storage and computation to an untrusted cloud. Fhe-Bloom is fully secure in the semi-honest model while Phe-Bloom slightly relaxes security guarantees in a trade-off for highly improved performance.

Results: We implement and evaluate both approaches on a large dataset of up to 50 patient genomes each with up to 1000000 variations (single nucleotide polymorphisms). For both implementations, overheads scale linearly in the number of patients and variations, while Phe-Bloom is faster by at least three orders of magnitude. For example, testing disease susceptibility of 50 patients with 100000 variations requires only a total of 308.31 s (σ=8.73 s) with our first approach and a mere 0.07 s (σ=0.00 s) with the second. We additionally discuss security guarantees of both approaches and their limitations as well as possible extensions towards more complex query types, e.g., fuzzy or range queries.

Conclusions: Both approaches handle practical problem sizes efficiently and are easily parallelized to scale with the elastic resources available in the cloud.
The fully homomorphic scheme, Fhe-Bloom, realizes a comprehensive outsourcing to the cloud, while the partially homomorphic scheme, Phe-Bloom, trades a slight relaxation of security guarantees against performance improvements by at least three orders of magnitude.
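
The abstract does not spell out the protocol, so the following is only a plaintext sketch of the underlying idea: a patient's variations are inserted into a Bloom filter, and a disease test checks whether every marker of interest hashes to set bits. It is not the authors' construction. The class and function names, the SHA-256 based hashing, the filter parameters, and the variant labels are all assumptions made for illustration, and the homomorphic encryption layer is omitted entirely. In an encrypted setting, the sum computed in `match_count` is the kind of operation one would expect to be evaluated over ciphertexts.

```python
import hashlib


class BloomFilter:
    """Plain (unencrypted) Bloom filter with m bits and k hash functions."""

    def __init__(self, m: int, k: int):
        self.m = m
        self.k = k
        self.bits = [0] * m

    def positions(self, item: str) -> list[int]:
        # Derive k bit positions by salting SHA-256 with the hash index.
        # (Assumption: any family of k independent hash functions would do.)
        result = []
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            result.append(int.from_bytes(digest[:8], "big") % self.m)
        return result

    def add(self, item: str) -> None:
        for p in self.positions(item):
            self.bits[p] = 1


def match_count(bf: BloomFilter, markers: list[str]) -> int:
    """Sum the filter bits at every position the markers hash to."""
    return sum(bf.bits[p] for marker in markers for p in bf.positions(marker))


def has_all_markers(bf: BloomFilter, markers: list[str]) -> bool:
    """True if all marker positions are set (false positives are possible)."""
    return match_count(bf, markers) == bf.k * len(markers)


# Hypothetical variant labels, for illustration only.
patient = BloomFilter(m=1024, k=3)
for variant in ["chr1:1000:A>G", "chr2:2000:C>T", "chr7:7000:G>A"]:
    patient.add(variant)

# Markers that were inserted are always reported as present.
print(has_all_markers(patient, ["chr1:1000:A>G", "chr7:7000:G>A"]))
```

Reducing the test to a single sum over bit positions keeps the query a simple arithmetic operation. A Bloom filter never produces false negatives, but a marker that was not inserted can still be reported as present with a probability that depends on `m`, `k`, and the number of inserted variations.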
