Secure Management of Biomedical Data With Cryptographic Hardware

The biomedical community is increasingly migrating toward research that depends on large quantities of genomic and clinical data. At the same time, various regulations require that such data be shared beyond the initial collecting organization (e.g., an academic medical center). It is critical that such data be shared and managed in a manner that upholds the privacy of the corresponding individuals and the overall security of the system. In general, organizations have attempted to achieve these goals through deidentification methods that remove explicitly and potentially identifying features (e.g., names, dates, and geocodes). However, a growing number of studies demonstrate that deidentified data can be reidentified to named individuals using simple automated methods. As an alternative, it has been shown that biomedical data can be shared, managed, and analyzed through practical cryptographic protocols without revealing the contents of any particular record. Yet such protocols require multiple third parties, which may not always be feasible given trust or bandwidth constraints. Thus, in this paper, we introduce a framework that removes the need for multiple third parties by collocating the services that store and process sensitive biomedical data through the integration of cryptographic hardware. Within this framework, we define a secure protocol to process genomic data and perform a series of experiments to demonstrate that such an approach can run efficiently for typical biomedical investigations.
