Navigating Alignment for Non-identical Client Class Sets: A Label Name-Anchored Federated Learning Framework

Traditional federated classification methods, even those designed for non-IID clients, assume that each client annotates its local data with respect to the same universal class set. In this paper, we focus on a more general yet practical setting: non-identical client class sets, where clients focus on their own (different or even non-overlapping) class sets and seek a global model that works for the union of these classes. If one views classification as finding the best match between representations produced by the data and label encoders, such heterogeneity in client class sets poses a significant new challenge -- local encoders at different clients may operate in different and even independent latent spaces, making server-side aggregation difficult. We propose a novel framework, FedAlign, to align the latent spaces across clients from both the label and data perspectives. From the label perspective, we leverage expressive natural-language class names as a common ground for label encoders to anchor class representations and to guide data-encoder learning across clients. From the data perspective, during local training we treat the global class representations as anchors and leverage the data points that are sufficiently close to (or far from) the anchors of locally-unaware classes to align the data encoders across clients. Our theoretical analysis of generalization performance and extensive experiments on four real-world datasets spanning different tasks confirm that FedAlign outperforms various state-of-the-art (non-IID) federated classification methods.
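The core idea of label-name anchoring can be illustrated with a hypothetical minimal sketch (not the authors' code): class names are embedded by a shared text encoder, so every client obtains the same anchor for the same class even when local class sets differ, and classification reduces to matching a data embedding against the anchors. The toy `embed_label` below is a deterministic stand-in for a pretrained label encoder such as BERT; all class names and dimensions are illustrative assumptions.

```python
import hashlib
import numpy as np

def embed_label(name, dim=64):
    # Toy stand-in for a pretrained label encoder (e.g. BERT on the class
    # name): a deterministic unit vector seeded by the name, so every
    # client derives the same anchor for the same class name.
    seed = int(hashlib.md5(name.encode()).hexdigest(), 16) % (2**32)
    v = np.random.default_rng(seed).normal(size=dim)
    return v / np.linalg.norm(v)

def classify(x, anchors):
    # Classification as matching: choose the class whose anchor has the
    # highest cosine similarity with the data embedding x.
    names = list(anchors)
    sims = [float(x @ anchors[n]) for n in names]
    return names[int(np.argmax(sims))]

# Two clients with non-overlapping local class sets; the server can still
# take the union because all anchors live in one shared label space.
client_a = {c: embed_label(c) for c in ["walking", "running"]}
client_b = {c: embed_label(c) for c in ["sitting", "standing"]}
global_anchors = {**client_a, **client_b}

# A data embedding that a well-aligned data encoder placed near "running".
x = embed_label("running") + 0.05 * np.random.default_rng(1).normal(size=64)
x = x / np.linalg.norm(x)
print(classify(x, global_anchors))  # → running
```

Because the anchors are fixed by the class names rather than learned per client, clients never need to exchange or average classifier weights for classes they have never seen, which is what makes the union over non-identical class sets well-defined.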
