Distributed privacy-preserving P2P data mining via probabilistic neural network committee machines

This work describes a probabilistic neural network (PNN) committee machine for peer-to-peer (P2P) data mining. The pattern neurons of the PNN committee are locally trained, class-specialized regularization network peer classifiers. Training respects the asynchronous, distributed, and privacy-preserving requirements that arise in P2P systems. The peer classifiers are first trained in parallel, each on its own local data. Although the peers cannot exchange local data, they can exchange their classifiers in the form of binaries or agents. An asynchronous distributed P2P computing cycle is then executed to construct a mutual validation matrix: the training set of each peer serves as the validation set for the other peers' classifiers, and only average rates are returned. We demonstrate that this matrix supports weight-based ensemble selection of the best peer members for every class, which yields the class-specialized peer modules of the committee machine.
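The procedure outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. Everything below is an assumption made for illustration: the synthetic three-class data, the Gaussian kernel, the parameters `lam` and `gamma`, the use of per-class accuracy as the "average rate", and plain top-`k` selection standing in for the weight-based selection. The names `RegNetPeer`, `mutual_validation`, `select_per_class`, and `committee_predict` are hypothetical. The sketch also runs the cycle sequentially rather than asynchronously, and it does not address what a shipped classifier might itself reveal about its training data.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Squared Euclidean distances between rows of A and rows of B.
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

class RegNetPeer:
    """Regularization network (kernel ridge, one-vs-rest) trained on local data only."""
    def __init__(self, X, y, n_classes, lam=1e-2, gamma=1.0):
        self.X, self.gamma = X, gamma
        Y = np.eye(n_classes)[y]                      # one-hot targets
        K = rbf_kernel(X, X, gamma)
        self.coef = np.linalg.solve(K + lam * len(X) * np.eye(len(X)), Y)

    def scores(self, Xq):
        return rbf_kernel(Xq, self.X, self.gamma) @ self.coef

    def predict(self, Xq):
        return self.scores(Xq).argmax(1)

def per_class_rates(clf, X, y, n_classes):
    """Run a visiting classifier on local data; return only per-class average accuracy."""
    pred = clf.predict(X)
    return np.array([(pred[y == c] == c).mean() if (y == c).any() else np.nan
                     for c in range(n_classes)])

def mutual_validation(peers, data, n_classes):
    """V[i, j, c]: rate of peer i's classifier on peer j's data for class c."""
    n = len(peers)
    V = np.full((n, n, n_classes), np.nan)            # diagonal stays NaN (no self-validation)
    for i in range(n):
        for j in range(n):
            if i != j:
                V[i, j] = per_class_rates(peers[i], *data[j], n_classes)
    return V

def select_per_class(V, k):
    """For each class, indices of the k peers with the best mean rate on other peers' data."""
    mean_rate = np.nanmean(V, axis=1)                 # shape (n_peers, n_classes)
    return {c: np.argsort(-mean_rate[:, c])[:k] for c in range(V.shape[2])}

def committee_predict(peers, selected, Xq):
    """Class score = average output of that class's selected peers; predict the argmax."""
    S = np.column_stack([np.mean([peers[i].scores(Xq)[:, c] for i in idx], axis=0)
                         for c, idx in sorted(selected.items())])
    return S.argmax(1)

# Synthetic demo (assumed data): three Gaussian blobs split across four peers.
rng = np.random.default_rng(0)
n_classes, n_peers = 3, 4
centers = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 3.0]])

def sample(n):
    y = rng.integers(0, n_classes, n)
    return centers[y] + 0.5 * rng.standard_normal((n, 2)), y

data = [sample(60) for _ in range(n_peers)]                 # local data never leaves a peer
peers = [RegNetPeer(X, y, n_classes) for X, y in data]      # trained independently
V = mutual_validation(peers, data, n_classes)               # only rates are exchanged
selected = select_per_class(V, k=2)
X_test, y_test = sample(200)
acc = (committee_predict(peers, selected, X_test) == y_test).mean()
print("selected peers per class:", selected, "committee accuracy:", acc)
```

In this sketch the only values that cross peer boundaries are the trained classifiers and the per-class rates stored in `V`. Averaging the selected peers' outputs for each class loosely mirrors a PNN summation layer, with the selected peer classifiers playing the role of pattern neurons.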
