FedFA: Federated Feature Augmentation

Federated learning is a distributed paradigm that allows multiple parties to collaboratively train deep models without exchanging raw data. However, the data distribution across clients is naturally non-i.i.d., which severely degrades the learned model. The primary goal of this paper is to develop a robust federated learning algorithm that addresses feature shift in clients' samples, which can arise from various factors, e.g., acquisition differences in medical imaging. To this end, we propose FedFA, which approaches federated learning from the distinct perspective of federated feature augmentation. FedFA builds on the insight that each client's data distribution can be characterized by the statistics (i.e., mean and standard deviation) of its latent features, and that these local statistics can be manipulated globally, i.e., based on information from the entire federation, to give clients a better sense of the underlying distribution and thereby alleviate local data bias. Based on this insight, we augment each local feature statistic probabilistically by sampling from a normal distribution whose mean is the original statistic and whose variance quantifies the augmentation scope. Key to our approach is determining a meaningful Gaussian variance, which is accomplished by taking into account not only the biased data of each individual client but also the underlying feature statistics characterized by all participating clients. We offer both theoretical and empirical justification for the effectiveness of FedFA. Our code is available at https://github.com/tfzhou/FedFA.
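
The augmentation step described above can be illustrated with a minimal sketch. This is not the released FedFA implementation. The function names (`channel_stats`, `federation_variance`, `augment_channel`) are hypothetical. Using the plain variance of per-client statistics as the "augmentation scope" is an assumption made for illustration, since the abstract does not give the exact rule for combining client-specific and federation-wide information. Each channel is treated as a flat list of activations.

```python
import math
import random
import statistics

EPS = 1e-6  # keeps the standard deviation strictly positive


def channel_stats(values):
    """Return (mean, standard deviation) of one channel's activations."""
    return statistics.fmean(values), statistics.pstdev(values) + EPS


def federation_variance(per_client_stats):
    """Variance of one statistic across clients.

    Assumption: used here as a stand-in for the augmentation scope;
    the paper's actual variance estimate may differ.
    """
    return statistics.pvariance(per_client_stats)


def augment_channel(values, var_mu, var_sigma, rng):
    """Resample the channel's mean and std from Gaussians centred on the
    original statistics, then re-standardize the activations with them."""
    mu, sigma = channel_stats(values)
    new_mu = rng.gauss(mu, math.sqrt(var_mu))
    # Clamp so the sampled scale stays positive (a choice of this sketch).
    new_sigma = max(EPS, rng.gauss(sigma, math.sqrt(var_sigma)))
    return [new_sigma * (v - mu) / sigma + new_mu for v in values]


# Toy federation: one channel of latent features per client (made-up values).
clients = [
    [0.1, 0.4, 0.3, 0.2],
    [1.2, 1.0, 1.5, 1.1],
    [0.6, 0.9, 0.7, 0.8],
]
stats = [channel_stats(c) for c in clients]
var_mu = federation_variance([m for m, _ in stats])
var_sigma = federation_variance([s for _, s in stats])

augmented = augment_channel(clients[0], var_mu, var_sigma, random.Random(0))
print(augmented)
```

With both variances set to zero the function returns the input features unchanged, so the variance alone controls how far a client's features are pushed toward statistics seen elsewhere in the federation.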
