LoAdaBoost: Loss-based AdaBoost federated machine learning with reduced computational complexity on IID and non-IID intensive care data

Intensive care data are valuable for improving health care, policy making and many other purposes. Vast amounts of such data are stored in different locations, on many different devices and in different data silos. Sharing data among these sources is a major challenge for regulatory, operational and security reasons. One potential solution is federated machine learning, which sends machine learning algorithms to all data sources simultaneously, trains a model at each source and aggregates the learned models. This strategy allows valuable data to be used without moving them. One challenge in applying federated machine learning is that data from diverse sources may follow different distributions. To tackle this problem, we proposed an adaptive boosting method named LoAdaBoost that increases the efficiency of federated machine learning. Using intensive care unit data from hospitals, we investigated learning performance in IID and non-IID data distribution scenarios, and showed that the proposed LoAdaBoost method achieved higher predictive accuracy with lower computational complexity than the baseline method.
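The train-locally-then-aggregate loop described above can be illustrated with a minimal sketch. It shows plain federated averaging with a logistic regression model, not the LoAdaBoost method itself, whose loss-based details are not given here. The function names (`local_train`, `federated_average`, `run_rounds`), the learning rate, and the epoch and round counts are all hypothetical choices made for illustration.

```python
import numpy as np

def local_train(weights, X, y, lr=0.1, epochs=5):
    """Run a few gradient steps of logistic regression on one client's data."""
    w = weights.copy()
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X @ w))   # predicted probabilities
        grad = X.T @ (p - y) / len(y)      # gradient of the mean log loss
        w -= lr * grad
    return w

def federated_average(client_weights, client_sizes):
    """Average client models, weighting each by its number of samples."""
    sizes = np.asarray(client_sizes, dtype=float)
    stacked = np.stack(client_weights)
    return (stacked * sizes[:, None]).sum(axis=0) / sizes.sum()

def run_rounds(clients, n_features, rounds=20):
    """clients is a list of (X, y) pairs; raw data never leaves local_train."""
    global_w = np.zeros(n_features)        # assumed zero initialisation
    for _ in range(rounds):
        local = [local_train(global_w, X, y) for X, y in clients]
        global_w = federated_average(local, [len(y) for _, y in clients])
    return global_w
```

Only model weights cross the client boundary in this sketch. A non-IID scenario could be simulated by giving each `(X, y)` pair a different label or feature distribution.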
