Federating Medical Deep Learning Models from Private Jupyter Notebooks to Distributed Institutions

Deep learning-based algorithms have driven tremendous progress in recent years, but they face a bottleneck: their development relies heavily on access to large datasets. To mitigate this limitation, cross-silo federated learning has emerged as a way for multiple institutions to train a model collaboratively without sharing the raw data used for training. However, although artificial intelligence experts can develop state-of-the-art models and actively share their code through notebook environments, implementing a federated learning system in real-world applications entails significant engineering and deployment effort. To reduce the complexity of federation setups and bridge the gap between federated learning and notebook users, this paper introduces the Notebook Federator, a solution that uses the Jupyter environment as part of the federated learning pipeline and simplifies its automation. The feasibility of this approach is demonstrated with a collaborative model for a digital pathology image analysis task, in which the federated model reaches an accuracy of 0.8633 on the test set, compared with 0.7881, 0.6514, and 0.8096 for the centralized configuration of each institution, respectively. As a fast and reproducible tool, the proposed solution enables the deployment of a cross-country federated environment in only a few minutes.
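
To make the idea of training across institutions without sharing raw data concrete, the following is a minimal sketch of one aggregation round. It assumes a sample-size-weighted average of model parameters (federated averaging); the abstract does not state which aggregation rule the Notebook Federator uses, and the function name `federated_average` and the toy parameter vectors are hypothetical.

```python
from typing import List, Sequence


def federated_average(client_weights: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Combine per-institution parameter vectors into one global vector.

    Assumption: sample-size-weighted averaging. Only parameters and
    dataset sizes reach the aggregator, never the raw training data.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one dataset size per parameter vector")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")

    dim = len(client_weights[0])
    averaged = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all parameter vectors must have the same length")
        # Each institution contributes in proportion to its local sample count.
        for i, w in enumerate(weights):
            averaged[i] += w * size / total
    return averaged


# Toy example: two institutions holding 1 and 3 samples.
global_weights = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

In a real round, each institution would first update its parameters locally on private data, send them to the aggregator, and receive the averaged vector back for the next round.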
