Outlier Detection by Privacy-Preserving Ensemble Decision Tree Using Homomorphic Encryption
One of the most important processes in big data analysis is outlier detection, in which anomalies in observed data are detected and eliminated to improve system performance. Many approaches to outlier detection have been proposed for classification and prediction purposes. This paper focuses on outlier detection in a practical setting where multiple organizations hold different data sets for the same task but cannot share them directly for privacy reasons; that is, no organization is allowed to provide its sensitive data to other entities, yet all of them are keen to cooperate in the data analysis. To address this issue, we present a new outlier detection approach for multiple organizations that uses a decision tree ensemble based on the federated learning scheme. Specifically, we extend an existing outlier detection model, Isolation Forest, to the federated learning setting, and protect the data of each entity by introducing additive homomorphic encryption. Experimental results on several benchmark data sets demonstrate that the proposed privacy-preserving Isolation Forest (pp-iForest) achieves stable classification performance, almost the same as that obtained in a single-organization setting, even when the number of organizations is increased.
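To illustrate what additive homomorphic encryption makes possible, the following is a minimal sketch of a textbook Paillier scheme in Python. It is not the protocol of the paper, whose details are not given in the abstract. The primes are toy-sized and insecure, and the function names (`keygen`, `encrypt`, `add_encrypted`, `decrypt`) and the per-organization values in `local_scores` are hypothetical, chosen only for illustration. The sketch assumes each organization holds an integer local statistic (for example, a path-length sum from its own trees) and that only the encrypted values are combined.

```python
import math
import secrets

def keygen(p, q):
    """Toy Paillier key generation from two primes (far too small for real use)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because the generator is fixed to g = n + 1
    return n, (lam, mu)

def encrypt(n, m):
    """Encrypt an integer 0 <= m < n under the public key n."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random r in [1, n-1]
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts (mod n)."""
    return (c1 * c2) % (n * n)

def decrypt(n, priv, c):
    """Recover the plaintext with the private key (lam, mu)."""
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n

# Hypothetical scenario: three organizations each hold a local integer statistic.
n, priv = keygen(1009, 1013)
local_scores = [12, 7, 9]
ciphertexts = [encrypt(n, s) for s in local_scores]

# An aggregator combines the ciphertexts without seeing any individual value.
aggregate = ciphertexts[0]
for c in ciphertexts[1:]:
    aggregate = add_encrypted(n, aggregate, c)

# Only the private-key holder can decrypt, and it learns only the sum.
total = decrypt(n, priv, aggregate)
assert total == sum(local_scores)
```

In this sketch the aggregator never handles plaintext, and the key holder sees only the combined value, which is the property that lets an ensemble statistic be computed across organizations without exposing any single organization's contribution. A real deployment would use a vetted cryptographic library and key sizes of at least 2048 bits.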