Privacy-Preserving Support Vector Machine Training Over Blockchain-Based Encrypted IoT Data in Smart Cities

Machine learning (ML) techniques have been widely used in many smart city sectors, where a huge amount of data is gathered from various Internet of Things (IoT) devices. As a typical ML model, the support vector machine (SVM) enables efficient data classification and thereby finds applications in real-world scenarios such as disease diagnosis and anomaly detection. Training an SVM classifier usually requires a collection of labeled IoT data from multiple entities, which raises serious concerns about data privacy. Most existing solutions rely on the implicit assumption that the training data can be reliably collected from multiple data providers, which is often not the case in reality. To bridge the gap between ideal assumptions and realistic constraints, in this paper we propose secureSVM, a privacy-preserving SVM training scheme over blockchain-based encrypted IoT data. We utilize blockchain techniques to build a secure and reliable data sharing platform among multiple data providers, where IoT data is encrypted and then recorded on a distributed ledger. We design secure building blocks, such as secure polynomial multiplication and secure comparison, by employing the homomorphic cryptosystem Paillier, and construct a secure SVM training algorithm that requires only two interactions in a single iteration, with no need for a trusted third party. Rigorous security analysis proves that the proposed scheme ensures the confidentiality of the sensitive data of each data provider as well as of the SVM model parameters of data analysts. Extensive experiments demonstrate the efficiency of the proposed scheme.
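
As a rough illustration of why an additively homomorphic cryptosystem such as Paillier is suited to the building blocks mentioned above, the following minimal Python sketch implements textbook Paillier with toy parameters and uses it to evaluate an inner product between encrypted features and plaintext weights. It is not the secureSVM protocol: the function names (keygen, encrypt, decrypt, add, mul_const, encrypted_dot), the tiny primes, and the restriction to non-negative integers are all assumptions made for this example, and the key size is far too small for real use.

```python
import math
import secrets


def keygen(p, q):
    """Textbook Paillier key generation with the common choice g = n + 1."""
    n = p * q
    lam = math.lcm(p - 1, q - 1)
    # With g = n + 1, L(g^lam mod n^2) = lam mod n, so mu is the inverse of lam mod n.
    mu = pow(lam, -1, n)
    return (n, n + 1), (lam, mu)


def encrypt(pub, m):
    """Encrypt an integer 0 <= m < n as c = g^m * r^n mod n^2."""
    n, g = pub
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def decrypt(pub, priv, c):
    """Recover m = L(c^lam mod n^2) * mu mod n, where L(u) = (u - 1) // n."""
    n, _ = pub
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def add(pub, c1, c2):
    """Homomorphic addition: multiplying ciphertexts adds the plaintexts."""
    n, _ = pub
    return (c1 * c2) % (n * n)


def mul_const(pub, c, k):
    """Homomorphic scalar multiplication: exponentiation multiplies the plaintext by k."""
    n, _ = pub
    return pow(c, k, n * n)


def encrypted_dot(pub, enc_x, w):
    """Encrypted inner product of encrypted features enc_x with plaintext weights w."""
    acc = encrypt(pub, 0)
    for c, k in zip(enc_x, w):
        acc = add(pub, acc, mul_const(pub, c, k))
    return acc


if __name__ == "__main__":
    # Toy primes for illustration only; real deployments use much larger keys.
    pub, priv = keygen(1009, 1013)
    x = [1, 2, 3]          # hypothetical feature vector held by a data provider
    w = [4, 5, 6]          # hypothetical plaintext weights held by the analyst
    enc_x = [encrypt(pub, v) for v in x]
    result = decrypt(pub, priv, encrypted_dot(pub, enc_x, w))
    assert result == sum(a * b for a, b in zip(x, w))
```

The party holding the weights never sees the plaintext features, and only the holder of the private key can decrypt the resulting inner product. Operations that Paillier does not support directly, such as comparing two encrypted values, need an interactive protocol between the parties, which is why the scheme describes dedicated secure building blocks.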
