Logistic Regression on Homomorphic Encrypted Data at Scale

Machine learning on (homomorphic) encrypted data is a cryptographic method for analyzing private and/or sensitive data while keeping privacy. In the training phase, it takes as input an encrypted training data and outputs an encrypted model without ever decrypting. In the prediction phase, it uses the encrypted model to predict results on new encrypted data. In each phase, no decryption key is needed, and thus the data privacy is ultimately guaranteed. It has many applications in various areas such as finance, education, genomics, and medical field that have sensitive private data. While several studies have been reported on the prediction phase, few studies have been conducted on the training phase.In this paper, we present an efficient algorithm for logistic regression on homomorphic encrypted data, and evaluate our algorithm on real financial data consisting of 422,108 samples over 200 features. Our experiment shows that an encrypted model with a sufficient Kolmogorov Smirnow statistic value can be obtained in ∼17 hours in a single machine. We also evaluate our algorithm on the public MNIST dataset, and it takes ∼2 hours to learn an encrypted model with 96.4% accuracy. Considering the inefficiency of homomorphic encryption, our result is encouraging and demonstrates the practical feasibility of the logistic regression training on large encrypted data, for the first time to the best of our knowledge.

[1]  Léo Ducas,et al.  FHEW: Bootstrapping Homomorphic Encryption in Less Than a Second , 2015, EUROCRYPT.

[2]  Payman Mohassel,et al.  SecureML: A System for Scalable Privacy-Preserving Machine Learning , 2017, 2017 IEEE Symposium on Security and Privacy (SP).

[3]  Zvika Brakerski,et al.  Fully Homomorphic Encryption without Modulus Switching from Classical GapSVP , 2012, CRYPTO.

[4]  Ivan Damgård,et al.  Multiparty Computation from Threshold Homomorphic Encryption , 2000, EUROCRYPT.

[5]  A. Yao,et al.  Fair exchange with a semi-trusted third party (extended abstract) , 1997, CCS '97.

[6]  Pratibha Mishra,et al.  Advanced Engineering Mathematics , 2013 .

[7]  Amit Sahai,et al.  Threshold Fully Homomorphic Encryption , 2017, IACR Cryptol. ePrint Arch..

[8]  Jung Hee Cheon,et al.  Efficient Logistic Regression on Large Encrypted Data , 2018, IACR Cryptol. ePrint Arch..

[9]  Nicolas Gama,et al.  Improving TFHE: faster packed homomorphic operations and efficient circuit bootstrapping , 2017, IACR Cryptol. ePrint Arch..

[10]  Jin Li,et al.  Privacy-preserving outsourced classification in cloud computing , 2017, Cluster Computing.

[11]  Stratis Ioannidis,et al.  Privacy-Preserving Ridge Regression on Hundreds of Millions of Records , 2013, 2013 IEEE Symposium on Security and Privacy.

[12]  Craig Gentry,et al.  Fully homomorphic encryption using ideal lattices , 2009, STOC '09.

[13]  Pascal Paillier,et al.  Fast Homomorphic Evaluation of Deep Discretized Neural Networks , 2018, IACR Cryptol. ePrint Arch..

[14]  Xiaoqian Jiang,et al.  Secure Logistic Regression Based on Homomorphic Encryption: Design and Evaluation , 2018, IACR Cryptol. ePrint Arch..

[15]  Michael Naehrig,et al.  ML Confidential: Machine Learning on Encrypted Data , 2012, ICISC.

[16]  Frederik Vercauteren,et al.  Somewhat Practical Fully Homomorphic Encryption , 2012, IACR Cryptol. ePrint Arch..

[17]  Anantha Chandrakasan,et al.  Gazelle: A Low Latency Framework for Secure Neural Network Inference , 2018, IACR Cryptol. ePrint Arch..

[18]  Jung Hee Cheon,et al.  Homomorphic Encryption for Arithmetic of Approximate Numbers , 2017, ASIACRYPT.

[19]  Michael Naehrig,et al.  CryptoNets: applying neural networks to encrypted data with high throughput and accuracy , 2016, ICML 2016.

[20]  Jung Hee Cheon,et al.  Logistic regression model training based on the approximate homomorphic encryption , 2018, BMC Medical Genomics.

[21]  Yoshinori Aono,et al.  Scalable and Secure Logistic Regression via Homomorphic Encryption , 2016, IACR Cryptol. ePrint Arch..

[22]  Yann LeCun,et al.  The mnist database of handwritten digits , 2005 .

[23]  Craig Gentry,et al.  (Leveled) fully homomorphic encryption without bootstrapping , 2012, ITCS '12.