Modeling with Homophily Driven Heterogeneous Data in Gossip Learning

Training deep learning models on data distributed and local to edge devices such as mobile phones is a prominent recent research direction. In a Gossip Learning (GL) system, each participating device maintains a model trained on its local data and iteratively aggregates it with the models from its neighbors in a communication network. While the fully distributed operation of GL comes with natural advantages over the centralized orchestration in Federated Learning (FL), its convergence becomes particularly slow when the data distribution is heterogeneous and aligns with the clustered structure of the communication network. These characteristics are pervasive in practical applications, as people with similar interests (and thus similar data) tend to form communities. This paper proposes a data-driven neighbor weighting strategy for aggregating the models: it enables faster diffusion of knowledge across the communities in the network and leads to quicker convergence. We augment the method to make it computationally efficient and fair: the devices quickly converge to the same model. We evaluate our method on real datasets and on synthetic datasets generated with a novel generative model for communication networks with heterogeneous data. Our extensive empirical evaluation shows that the proposed method attains a faster convergence rate than the baselines. For example, the median test accuracy of a decentralized bird image classifier reaches 81% within 80 rounds with our proposed method, whereas the baseline reaches only 46%.
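To make the aggregation step concrete, the sketch below illustrates one possible reading of the data-driven neighbor weighting described above: each device summarizes its local data with a label histogram, weights each neighbor by how dissimilar that neighbor's histogram is from its own, and mixes model parameters accordingly. The histogram summary, the total-variation dissimilarity, and the `self_weight` parameter are illustrative assumptions for this sketch, not the paper's exact formulation.

```python
# Minimal sketch of one gossip round with data-driven neighbor weighting.
# This is an assumed instantiation of the idea in the abstract, not the
# paper's exact algorithm.
import numpy as np

def label_histogram(labels, num_classes):
    """Normalized label distribution of a device's local data."""
    hist = np.bincount(labels, minlength=num_classes).astype(float)
    return hist / hist.sum()

def neighbor_weights(own_hist, neighbor_hists):
    """Assumed weighting: neighbors whose label distribution differs more
    from our own get larger weights, so knowledge crosses community
    boundaries faster (hypothetical choice for illustration)."""
    dissim = np.array([0.5 * np.abs(own_hist - h).sum() for h in neighbor_hists])
    dissim = dissim + 1e-8              # avoid an all-zero weight vector
    return dissim / dissim.sum()

def gossip_aggregate(own_params, neighbor_params, weights, self_weight=0.5):
    """Weighted average of the local model with the neighbors' models."""
    mixed = {}
    for name, value in own_params.items():
        nbr_avg = sum(w * p[name] for w, p in zip(weights, neighbor_params))
        mixed[name] = self_weight * value + (1.0 - self_weight) * nbr_avg
    return mixed
```

In use, a device would call `label_histogram` once on its local labels, compute `neighbor_weights` from the histograms exchanged with its neighbors, and apply `gossip_aggregate` to its parameter dictionary after each local training step; the bias toward dissimilar neighbors is what speeds up knowledge diffusion across homophilous communities.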
