A Light-Weight Crowdsourcing Aggregation in Privacy-Preserving Federated Learning System

Federated Machine Learning (FML) sheds light on secure distributed machine learning. However, generic FML methods may leak private information through the sharing of training information of individual models, and they perform relatively poorly when the training datasets for individual models are biased and diversified. This is a problem when combining models trained in different IoT device scenarios, since the available training datasets are usually limited and biased. To tackle this problem, we propose a novel approach to precisely ensemble results from different models in distributed edge devices. Instead of passing around the training information of individual models, which requires a relatively large amount of bandwidth and compromises data privacy, we suggest employing a trusted central agent that collects only the inference results from edge devices. Then, based on a limited amount of labeled data, the agent runs a specially designed statistical iterative crowdsourcing algorithm to combine the results into a more accurate aggregated prediction for a user query. Our proposed system model, the "Privacy-Preserving Federated Learning System", together with our light-weight Secure Crowdsourcing Aggregation (SC-Agg) algorithm, provides a more accurate prediction for outside queries at little cost, without any prior knowledge of what query will be submitted. We experimentally verify that, in our system, SC-Agg consistently outperforms both the majority voting method and the best-performing model of the ensemble in all testing scenarios. We believe that SC-Agg fits real-world IoT applications better than other methods, such as vanilla majority voting, because of its robustness and better performance.
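To make the setting concrete, here is a minimal sketch of the majority voting baseline named in the abstract, next to a simple reliability-weighted vote that a central agent could compute from a small labeled set. It is not the SC-Agg algorithm, which the abstract does not specify. The weighted vote is a swapped-in, non-iterative stand-in that assumes binary labels and log-odds weights, and all function names are hypothetical.

```python
import math
from collections import Counter


def majority_vote(predictions):
    """Baseline: return the label predicted by the most models."""
    return Counter(predictions).most_common(1)[0][0]


def estimate_weights(labeled_predictions, true_labels, eps=1e-3):
    """Assumed weighting scheme (not from the paper): one log-odds weight
    per model, computed from its accuracy on the small labeled set.

    labeled_predictions[i][m] is model m's prediction for labeled query i.
    """
    n_models = len(labeled_predictions[0])
    weights = []
    for m in range(n_models):
        correct = sum(
            1 for preds, y in zip(labeled_predictions, true_labels)
            if preds[m] == y
        )
        acc = correct / len(true_labels)
        # Clip so that accuracies of exactly 0 or 1 give finite weights.
        acc = min(max(acc, eps), 1 - eps)
        weights.append(math.log(acc / (1 - acc)))
    return weights


def weighted_vote(predictions, weights):
    """Return the label with the largest total weight (binary labels assumed)."""
    scores = {}
    for label, w in zip(predictions, weights):
        scores[label] = scores.get(label, 0.0) + w
    return max(scores, key=scores.get)


# The agent sees only inference results from three edge models,
# plus the true labels of four labeled queries.
labeled_predictions = [[1, 0, 0], [0, 1, 1], [1, 0, 1], [0, 1, 0]]
true_labels = [1, 0, 1, 0]
weights = estimate_weights(labeled_predictions, true_labels)

new_query_predictions = [1, 0, 0]
print(majority_vote(new_query_predictions))
print(weighted_vote(new_query_predictions, weights))
```

In this toy data the first model is always right on the labeled queries, the second is always wrong, and the third is right half the time. The weighted vote therefore follows the first model even when it is outvoted, which is the kind of case where plain majority voting fails.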
