Research on Differentially Private Bayesian Classification Algorithm for Data Streams

To address the risk of privacy leakage in data stream mining, this paper proposes a differentially private Naive Bayes classification algorithm for data streams (DP-NB). A Gaussian Naive Bayes model is adopted first; feature selection, cross-validation, and a weighting mechanism are then used to improve the model and to handle concept drift. Next, noise is added to the model parameters, and an incremental update scheme is introduced to optimize the Gaussian classifier. Theoretical analysis proves that the DP-NB algorithm satisfies $\varepsilon$-differential privacy. Finally, experimental results show that the DP-NB algorithm achieves improved classification accuracy while preserving privacy.
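
To make the parameter perturbation and incremental update steps concrete, the following is a minimal illustrative sketch and not the DP-NB algorithm itself. It assumes features are clipped to [0, 1], add/remove-one neighbouring datasets, and a single release of the model; under those assumptions one record changes the per-class count, sums, and sums of squares by at most 2d + 1 in L1 norm, so Laplace noise of scale (2d + 1)/ε on each statistic gives ε-differential privacy for that release. The class `NoisyGaussianNB`, the function `predict`, and the variance floor are hypothetical, and feature selection, cross-validation, weighting, and concept-drift handling are omitted.

```python
import numpy as np


class NoisyGaussianNB:
    """Hypothetical sketch: incremental Gaussian Naive Bayes whose
    sufficient statistics are perturbed with Laplace noise on release."""

    def __init__(self, n_features, classes, epsilon, seed=None):
        self.d = n_features
        self.classes = list(classes)
        self.epsilon = epsilon
        self.rng = np.random.default_rng(seed)
        k = len(self.classes)
        self.count = np.zeros(k)              # records seen per class
        self.s1 = np.zeros((k, n_features))   # per-class feature sums
        self.s2 = np.zeros((k, n_features))   # per-class sums of squares

    def partial_fit(self, X, y):
        """Incremental update: fold one batch of the stream into the statistics."""
        X = np.clip(np.asarray(X, dtype=float), 0.0, 1.0)  # assumed bound [0, 1]
        y = np.asarray(y)
        for i, c in enumerate(self.classes):
            Xc = X[y == c]
            self.count[i] += len(Xc)
            self.s1[i] += Xc.sum(axis=0)
            self.s2[i] += (Xc ** 2).sum(axis=0)

    def release(self):
        """Return noisy (prior, mean, variance); intended to be called once."""
        # L1 sensitivity of (count, sums, sums of squares) is 2d + 1.
        scale = (2 * self.d + 1) / self.epsilon
        n = np.maximum(self.count + self.rng.laplace(0.0, scale, self.count.shape), 1.0)
        s1 = self.s1 + self.rng.laplace(0.0, scale, self.s1.shape)
        s2 = self.s2 + self.rng.laplace(0.0, scale, self.s2.shape)
        # Post-processing (clipping, flooring) does not affect the privacy guarantee.
        mean = np.clip(s1 / n[:, None], 0.0, 1.0)
        var = np.maximum(s2 / n[:, None] - mean ** 2, 1e-3)  # assumed variance floor
        prior = n / n.sum()
        return prior, mean, var


def predict(params, X, classes):
    """Classify rows of X with the released Gaussian Naive Bayes parameters."""
    prior, mean, var = params
    X = np.clip(np.asarray(X, dtype=float), 0.0, 1.0)
    diff = X[:, None, :] - mean[None, :, :]          # (samples, classes, features)
    log_lik = -0.5 * (np.log(2 * np.pi * var)[None] + diff ** 2 / var[None]).sum(axis=2)
    scores = np.log(prior)[None, :] + log_lik
    return np.asarray(classes)[scores.argmax(axis=1)]
```

In use, `partial_fit` would be called on each arriving batch and `release` once when the model is published; releasing repeatedly over the stream would consume additional privacy budget, which this sketch does not account for.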