The future of digital health with federated learning

Data-driven machine learning (ML) has emerged as a promising approach for building accurate and robust statistical models from medical data, which modern healthcare systems collect in huge volumes. Existing medical data is not fully exploited by ML, primarily because it sits in data silos and privacy concerns restrict access to it. Without access to sufficient data, however, ML will be prevented from reaching its full potential and, ultimately, from making the transition from research to clinical practice. This paper considers key factors contributing to this issue, explores how federated learning (FL) may provide a solution for the future of digital health, and highlights the challenges and considerations that need to be addressed.
