Privacy Preserving Naive Bayes Classifier for Horizontally Partitioned Data Using Secure Division

Extracting interesting patterns often requires training a model on data held at multiple sites, yet the data at each site should not be revealed while the patterns are extracted. Distributed data mining enables sites to mine patterns based on the knowledge available at different sites. When sites collaborate to develop a model, it is extremely important to protect the privacy of both the data and the intermediate results. The data maintained at each site often share the same features, that is, the data are horizontally partitioned. In this paper, we design an improved privacy-preserving distributed naive Bayesian classifier that is trained on horizontally partitioned data. The trained model is propagated to the sites involved in the computation to help classify a new tuple. We further analyze the security and complexity of the algorithm.
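As a rough illustration of the setting, the sketch below trains a naive Bayes model over horizontally partitioned data by combining per-site counts with an additive secret-sharing sum. It is not the protocol of this paper: the function names (`share`, `secure_sum`, `local_counts`, `train`, `classify`), the modulus, and the toy records are all assumptions made for illustration, the parties are simulated in a single process, and the aggregate counts are revealed in the clear, whereas a secure-division scheme would aim to hide them as well.

```python
import random
from collections import Counter

# Assumption: a public modulus larger than any aggregate count.
MODULUS = 2**61 - 1


def share(value, n_parties, rng):
    """Split value into n additive shares that sum to value mod MODULUS."""
    shares = [rng.randrange(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def secure_sum(local_values, rng):
    """Add one private integer per site; only masked shares are exchanged."""
    n = len(local_values)
    # Row i holds the shares site i sends out (column j goes to site j).
    sent = [share(v, n, rng) for v in local_values]
    # Site j adds the shares it received and publishes only that partial sum.
    partial = [sum(sent[i][j] for i in range(n)) % MODULUS for j in range(n)]
    return sum(partial) % MODULUS


def local_counts(records):
    """Count labels and (feature index, value, label) triples at one site."""
    class_counts, feature_counts = Counter(), Counter()
    for features, label in records:
        class_counts[label] += 1
        for idx, value in enumerate(features):
            feature_counts[(idx, value, label)] += 1
    return class_counts, feature_counts


def train(sites, rng):
    """Aggregate per-site counts with secure_sum (simulated in one process)."""
    per_site = [local_counts(records) for records in sites]
    # Simplification: the key sets are treated as public schema information.
    class_keys = sorted(set().union(*(c for c, _ in per_site)))
    feature_keys = sorted(set().union(*(f for _, f in per_site)))
    class_totals = {
        k: secure_sum([c[k] for c, _ in per_site], rng) for k in class_keys
    }
    feature_totals = {
        k: secure_sum([f[k] for _, f in per_site], rng) for k in feature_keys
    }
    return class_totals, feature_totals


def classify(features, class_totals, feature_totals):
    """Pick the label maximizing P(label) * prod P(feature value | label)."""
    n = sum(class_totals.values())
    best_label, best_score = None, -1.0
    for label, count in class_totals.items():
        score = count / n
        for idx, value in enumerate(features):
            score *= feature_totals.get((idx, value, label), 0) / count
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data: two sites with the same two categorical features.
site_a = [(("sunny", "hot"), "no"), (("rainy", "mild"), "yes")]
site_b = [(("sunny", "mild"), "yes"), (("rainy", "mild"), "yes"),
          (("sunny", "hot"), "no")]

class_totals, feature_totals = train([site_a, site_b], random.Random(0))
print(classify(("sunny", "hot"), class_totals, feature_totals))
```

The sketch applies no smoothing, so an unseen feature value drives a class score to zero. The division in `classify` is performed on revealed totals; keeping those totals hidden is the step that a secure division protocol is meant to address.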
