Secure distributed genome analysis for GWAS and sequence comparison computation

BackgroundThe rapid increase in the availability and volume of genomic data makes significant advances in biomedical research possible, but sharing of genomic data poses challenges due to the highly sensitive nature of such data. To address the challenges, a competition for secure distributed processing of genomic data was organized by the iDASH research center.MethodsIn this work we propose techniques for securing computation with real-life genomic data for minor allele frequency and chi-squared statistics computation, as well as distance computation between two genomic sequences, as specified by the iDASH competition tasks. We put forward novel optimizations, including a generalization of a version of mergesort, which might be of independent interest.ResultsWe provide implementation results of our techniques based on secret sharing that demonstrate practicality of the suggested protocols and also report on performance improvements due to our optimization techniques.ConclusionsThis work describes our techniques, findings, and experimental results developed and obtained as part of iDASH 2015 research competition to secure real-life genomic computations and shows feasibility of securely computing with genomic data in practice.

[1]  A. Yao,et al.  Fair exchange with a semi-trusted third party (extended abstract) , 1997, CCS '97.

[2]  Tal Rabin,et al.  Simplified VSS and fast-track multiparty computations with applications to threshold cryptography , 1998, PODC '98.

[3]  Xiaoqian Jiang,et al.  Protecting genomic data analytics in the cloud: state of the art and opportunities , 2016, BMC Medical Genomics.

[4]  Ran Canetti,et al.  Security and Composition of Multiparty Cryptographic Protocols , 2000, Journal of Cryptology.

[5]  Marina Blanton,et al.  Private and oblivious set and multiset operations , 2012, ASIACCS '12.

[6]  Octavian Catrina,et al.  Improved Primitives for Secure Multiparty Integer Computation , 2010, SCN.

[7]  Dan Bogdanov,et al.  Sharemind: A Framework for Fast Privacy-Preserving Computations , 2008, ESORICS.

[8]  Yihua Zhang,et al.  PICCO: a general-purpose compiler for private distributed computation , 2013, CCS.

[9]  Benny Pinkas,et al.  Fairplay - Secure Two-Party Computation System , 2004, USENIX Security Symposium.

[10]  Chip Elliott,et al.  GENI - global environment for network innovations , 2008, LCN.

[11]  Benny Pinkas,et al.  Fairplay - Secure Two-Party Computation System (Awarded Best Student Paper!) , 2004 .

[12]  Yihua Zhang,et al.  Secure Computation on Floating Point Numbers , 2013, NDSS.

[13]  Octavian Catrina,et al.  Secure Computation with Fixed-Point Numbers , 2010, Financial Cryptography.

[14]  Emiliano De Cristofaro,et al.  Practical Private Set Intersection Protocols with Linear Complexity , 2010, Financial Cryptography.

[15]  Adi Shamir,et al.  How to share a secret , 1979, CACM.

[16]  Kenneth E. Batcher,et al.  Sorting networks and their applications , 1968, AFIPS Spring Joint Computing Conference.