Overcoming Noisy Labels in Federated Learning Through Local Self-Guiding

Federated Learning (FL) is a privacy-preserving machine learning paradigm that enables clients, such as Internet of Things (IoT) devices and smartphones, to jointly train a high-performance global model. However, in real-world FL deployments, carefully human-annotated labels are expensive and time-consuming to obtain, so the presence of incorrect labels (noisy labels) in the clients' local training data is inevitable, which degrades the performance of the global model. To tackle this problem, we propose a simple but effective method, Local Self-Guiding (LSG), which lets clients guide themselves during training in the presence of noisy labels. Specifically, LSG keeps the model from memorizing noisy labels by enhancing the confidence of its predictions. Meanwhile, it exploits knowledge from local historical models that have not yet fit the noisy patterns to extract potential ground-truth labels for samples. To retain this knowledge without storing extra models, LSG records the exponential moving average (EMA) of model output logits at different local training epochs as self-ensemble logits on the clients' devices, incurring negligible computation and storage overhead. Logit-based knowledge distillation is then conducted to guide the local training. Experiments on MNIST, Fashion-MNIST, CIFAR-10, and ImageNet-100 with multiple noise levels, as well as on an unbalanced noisy dataset, Clothing1M, demonstrate the robustness of LSG to noisy labels. The code of LSG is available at https://github.com/DaokuanBai/LSG-Main
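
To make the mechanism concrete, below is a minimal PyTorch sketch of what LSG-style local training could look like, based only on the description above. The EMA momentum, loss weights, temperature, and the use of entropy minimization as the confidence-enhancement term are illustrative assumptions, not the paper's actual choices; the authors' real implementation is in the linked repository.

```python
# Minimal sketch of LSG-style local training on one client (assumptions:
# ema_m, alpha, beta, T are illustrative; the dataset yields (x, y, idx)
# so each sample has a stable slot in the self-ensemble logit buffer).
import torch
import torch.nn.functional as F

def local_train_lsg(model, loader, optimizer, num_classes,
                    epochs=5, ema_m=0.9, alpha=1.0, beta=1.0, T=2.0):
    device = next(model.parameters()).device
    # Self-ensemble logits: EMA of per-sample outputs across local epochs,
    # stored as one tensor instead of keeping historical model copies.
    ema_logits = torch.zeros(len(loader.dataset), num_classes, device=device)

    for epoch in range(epochs):
        for x, y, idx in loader:
            x, y = x.to(device), y.to(device)
            logits = model(x)

            # Standard supervised loss on the (possibly noisy) local labels.
            loss = F.cross_entropy(logits, y)

            # Confidence enhancement: entropy minimization sharpens the
            # model's own predictions so it commits less to noisy labels
            # (one plausible reading of "enhancing prediction confidence").
            probs = F.softmax(logits, dim=1)
            loss += alpha * -(probs * probs.clamp_min(1e-8).log()).sum(1).mean()

            # Logit-based self-distillation: the EMA ensemble acts as a
            # teacher that reflects earlier local models which have not yet
            # fit the noisy patterns. Skipped in epoch 0 (buffer is empty).
            if epoch > 0:
                teacher = F.softmax(ema_logits[idx] / T, dim=1)
                student = F.log_softmax(logits / T, dim=1)
                loss += beta * (T ** 2) * F.kl_div(
                    student, teacher, reduction='batchmean')

            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

            # Update the stored self-ensemble logits after each step.
            with torch.no_grad():
                ema_logits[idx] = (ema_m * ema_logits[idx]
                                   + (1 - ema_m) * logits)
    return model
```

The per-sample logit buffer is the only extra state the client keeps between epochs, which matches the abstract's claim of negligible computation and storage overhead compared with storing whole historical models.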
