Multi-Agent Distributed Lifelong Learning for Collective Knowledge Acquisition

Lifelong machine learning methods acquire knowledge over a series of consecutive tasks, continually building upon their experience. Current lifelong learning algorithms rely upon a single learning agent that has centralized access to all data. In this paper, we extend the idea of lifelong learning from a single agent to a network of multiple agents that collectively learn a series of tasks. Each agent faces some (potentially unique) set of tasks; the key idea is that knowledge learned from these tasks may benefit other agents trying to learn different (but related) tasks. Our Collective Lifelong Learning Algorithm (CoLLA) provides an efficient way for a network of agents to share their learned knowledge in a distributed and decentralized manner, while preserving the privacy of the locally observed data. Here, a decentralized scheme is a subclass of distributed algorithms in which no central server exists and the computations, in addition to the data, are distributed among the agents. We provide theoretical guarantees for robust performance of the algorithm and empirically demonstrate that CoLLA outperforms existing approaches for distributed multi-task learning on a variety of data sets.
