Private Predictive Analysis on Encrypted Medical Data

Confidential medical records are increasingly stored in data centers hosted by hospitals or large companies. As sophisticated algorithms for predictive analysis on medical data continue to be developed, more and more computation is likely to be performed on private patient data. Encryption protects the privacy of medical information, but it limits what can be done with the data: conventional encryption methods in use today offer little or no ability to operate on encrypted data without decrypting it first. Homomorphic encryption makes such computations possible on encrypted data, without decrypting it and without access to the decryption key. In this paper, we discuss possible application scenarios for homomorphic encryption that ensure the privacy of sensitive medical data. We describe how to privately conduct predictive analysis tasks on encrypted data using homomorphic encryption. As a proof of concept, we present a working implementation of a prediction service running in the cloud (hosted on Microsoft's Windows Azure), which takes private encrypted health data as input and returns, in encrypted form, the probability of suffering from cardiovascular disease. Because the cloud service uses homomorphic encryption, it makes this prediction while handling only encrypted data, learning nothing about the submitted confidential medical data.
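To make the idea of predicting on encrypted inputs concrete, the following is a minimal sketch and not the implementation described above. It swaps in a toy Paillier cryptosystem (additively homomorphic only) with deliberately insecure key sizes, and evaluates just the linear part of a logistic-regression risk model on ciphertexts; the client applies the logistic function after decryption, which is a simplification. The feature names, `WEIGHTS_MILLI`, `INTERCEPT_MILLI`, and all function names are hypothetical and chosen only for illustration; they are not clinical coefficients. It assumes Python 3.8 or later.

```python
import math
import secrets

# Toy key material: two known Mersenne primes. Far too small for real security.
P, Q = 2**31 - 1, 2**61 - 1

# Hypothetical model, fixed-point scaled by 1000 (NOT real clinical coefficients).
# Features: age in years, systolic blood pressure, smoker flag (0 or 1).
WEIGHTS_MILLI = [50, 20, 600]
INTERCEPT_MILLI = -7000


def keygen(p=P, q=Q):
    """Return (public modulus n, secret key (lam, mu)) for Paillier with g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse of lam mod n
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt integer m (negative values are encoded mod n)."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (1 + (m % n) * n) % n2 * pow(r, n, n2) % n2


def decrypt(n, sk, c):
    """Decrypt and map the result back to a signed integer."""
    lam, mu = sk
    m = (pow(c, lam, n * n) - 1) // n * mu % n
    return m - n if m > n // 2 else m


def add(n, c1, c2):
    """Ciphertext of the sum of the two underlying plaintexts."""
    return c1 * c2 % (n * n)


def scale(n, c, k):
    """Ciphertext of k times the underlying plaintext, for a public integer k."""
    return pow(c, k % n, n * n)


def encrypted_score(n, enc_features):
    """Server side: sees only ciphertexts and the public modulus n."""
    acc = encrypt(n, INTERCEPT_MILLI)
    for c, w in zip(enc_features, WEIGHTS_MILLI):
        acc = add(n, acc, scale(n, c, w))
    return acc


# Client side: encrypt the features, send them out, decrypt the returned score.
n, sk = keygen()
features = [55, 140, 1]
enc_features = [encrypt(n, x) for x in features]
enc_score = encrypted_score(n, enc_features)
score_milli = decrypt(n, sk, enc_score)
probability = 1 / (1 + math.exp(-score_milli / 1000))
print(score_milli, probability)
```

The server function never touches the secret key, which mirrors the property claimed in the abstract: the prediction is computed while handling only encrypted data. A scheme that also supports multiplication of ciphertexts would be needed to keep the final probability itself encrypted.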
