Federated Learning from Small Datasets

Federated learning allows multiple parties to collaboratively train a joint model without sharing local data. This enables applications of machine learning in settings where data is inherently distributed and cannot be disclosed, such as the medical domain. In practice, joint training is usually achieved by aggregating local models, which requires the local training objectives to be similar in expectation to the joint (global) objective. Often, however, local datasets are so small that local objectives differ greatly from the global objective, causing federated learning to fail. We propose a novel approach that intertwines model aggregations with permutations of local models. The permutations expose each local model to a daisy chain of local datasets, resulting in more efficient training in data-sparse domains. This enables training on extremely small local datasets, such as patient data across hospitals, while retaining the training efficiency and privacy benefits of federated learning.
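
The interplay of permutation and aggregation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the linear least-squares model, plain parameter averaging as the aggregation step, and the names `local_step`, `daisy_chain_training`, `permute_every`, and `aggregate_every` are all assumptions chosen for illustration. Only model parameters move between clients; each local dataset stays where it is.

```python
import numpy as np

def local_step(w, X, y, lr=0.05):
    """One gradient step of least-squares regression on a client's local data."""
    grad = 2.0 * X.T @ (X @ w - y) / len(y)
    return w - lr * grad

def daisy_chain_training(datasets, dim, rounds=200,
                         permute_every=1, aggregate_every=10, seed=0):
    """Alternate local training with model permutation and averaging.

    datasets: list of (X, y) pairs, one per client (hypothetical setup).
    permute_every / aggregate_every: illustrative period names, assumed here.
    """
    rng = np.random.default_rng(seed)
    models = [np.zeros(dim) for _ in datasets]
    for t in range(1, rounds + 1):
        # Each client updates the model it currently holds on its own data.
        models = [local_step(w, X, y) for w, (X, y) in zip(models, datasets)]
        if t % aggregate_every == 0:
            # Aggregation: replace every local model by the average.
            avg = np.mean(models, axis=0)
            models = [avg.copy() for _ in datasets]
        elif t % permute_every == 0:
            # Daisy chaining: pass the models around in a random permutation,
            # so each model sees a sequence of different local datasets.
            perm = rng.permutation(len(models))
            models = [models[i] for i in perm]
    return np.mean(models, axis=0)

# Toy setting: 20 clients with only two noise-free samples each.
rng = np.random.default_rng(1)
w_true = np.array([1.0, -2.0, 0.5])
datasets = []
for _ in range(20):
    X = rng.normal(size=(2, 3))
    datasets.append((X, X @ w_true))

w = daisy_chain_training(datasets, dim=3)
print("distance to true weights:", np.linalg.norm(w - w_true))
```

In this toy setting no single client can determine the three weights from its two samples, which is the data-sparse regime the abstract targets; passing models between clients lets each model accumulate information from many local datasets between aggregations.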