Learning Task-Agnostic Embedding of Multiple Black-Box Experts for Multi-Task Model Fusion

Model fusion is an emerging area of study in collective learning where heterogeneous experts with private data and learning architectures need to combine their black-box knowledge for better performance. Existing literature achieves this via a local knowledge distillation scheme that transfuses the predictive patterns of each pre-trained expert onto a white-box imitator model, which can be incorporated efficiently into a global model. This scheme, however, does not extend to multi-task scenarios where different experts were trained to solve different tasks and only part of their distilled knowledge is relevant to a new task. To address this multi-task challenge, we develop a new fusion paradigm that represents each expert as a distribution over a spectrum of predictive prototypes, which are isolated from the task-specific information encoded within the prototype distribution. The task-agnostic prototypes can then be reintegrated to generate a new model that solves a new task encoded with a different prototype distribution. The fusion and adaptation performance of the proposed framework is demonstrated empirically on several real-world benchmark datasets.
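To make the prototype idea concrete, the following is a toy sketch only and not the paper's actual method. It assumes three hypothetical scalar prototypes (`PROTOTYPES`), encodes a task as a softmax-parameterized distribution over them, and adapts to a new task by fitting only that distribution with plain gradient descent on squared error while the prototypes stay fixed. All names, the choice of loss, and the optimizer are illustrative assumptions.

```python
import math

# Hypothetical task-agnostic prototypes: simple scalar predictors that stand in
# for the shared building blocks described in the abstract (an assumption).
PROTOTYPES = [
    lambda x: x,     # increasing trend
    lambda x: -x,    # decreasing trend
    lambda x: 1.0,   # constant offset
]

def softmax(logits):
    """Map unconstrained logits to a distribution over prototypes."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(weights, x):
    """Prediction of a model encoded as a distribution over prototypes."""
    return sum(w * p(x) for w, p in zip(weights, PROTOTYPES))

def fit_task_weights(data, steps=2000, lr=0.5):
    """Fit only the task-specific prototype distribution; prototypes are fixed."""
    logits = [0.0] * len(PROTOTYPES)
    for _ in range(steps):
        w = softmax(logits)
        # Gradient of the mean squared error with respect to the weights.
        grad_w = [0.0] * len(w)
        for x, y in data:
            err = predict(w, x) - y
            for k, p in enumerate(PROTOTYPES):
                grad_w[k] += 2.0 * err * p(x) / len(data)
        # Chain rule through the softmax: dL/dlogit_j = w_j * (g_j - sum_k w_k g_k).
        avg = sum(wk * gk for wk, gk in zip(w, grad_w))
        logits = [l - lr * wj * (gj - avg)
                  for l, wj, gj in zip(logits, w, grad_w)]
    return softmax(logits)

# Two hypothetical experts: same prototypes, different task distributions.
expert_a = [0.8, 0.1, 0.1]
expert_b = [0.1, 0.8, 0.1]

# A new task generated by the mixture (0.5, 0.2, 0.3): y = 0.3 * x + 0.3.
new_task = [(i / 4.0, 0.3 * (i / 4.0) + 0.3) for i in range(-4, 5)]
weights = fit_task_weights(new_task)
```

In this sketch the two experts differ only in their mixture weights, and `fit_task_weights` recovers a new distribution over the same prototypes for the unseen task. A learned prototype set and a probabilistic treatment of the distribution would be needed for anything closer to the framework the abstract describes.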
