Simple, Efficient and Convenient Decentralized Multi-task Learning for Neural Networks

Artificial intelligence relying on machine learning is increasingly used on small, personal, network-connected devices such as smartphones and voice assistants, and these applications will likely multiply with the development of the Internet of Things. The learning process requires large amounts of data, often real users' data, and substantial computing power. Decentralized machine learning can help protect users' privacy by keeping sensitive training data on users' devices, and it has the potential to alleviate the cost borne by service providers by offloading some of the learning effort to user devices. Unfortunately, most approaches proposed so far for distributed learning with neural networks are mono-task and do not transfer easily to multi-task problems, in which users seek to solve related but distinct learning tasks; the few existing multi-task approaches have serious limitations. In this paper, we propose a novel learning method for neural networks that is decentralized, multi-task, and keeps users' data local. Our approach works with different learning algorithms and on various types of neural networks. We formally analyze the convergence of our method, and we evaluate its efficiency in different situations, on various kinds of neural networks and with different learning algorithms, demonstrating its benefits in terms of learning quality and convergence.
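To make the setting concrete, the sketch below illustrates one common way to combine decentralized training with per-user tasks. This is a minimal, purely illustrative example and not the method proposed in the paper: each node runs local SGD on its own data (which never leaves the device) and gossip-averages a shared hidden layer with a neighbor, while keeping a task-specific output layer private. The Node class, gossip_round function, toy architecture, and toy tasks are all assumptions made for illustration.

    import numpy as np

    rng = np.random.default_rng(0)

    class Node:
        """Hypothetical node: shared hidden layer W1, private output layer W2."""
        def __init__(self, n_in, n_hidden, n_out):
            self.W1 = rng.normal(0, 0.1, (n_hidden, n_in))   # shared across nodes
            self.W2 = rng.normal(0, 0.1, (n_out, n_hidden))  # personal, never exchanged

        def sgd_step(self, x, y, lr=0.05):
            # Forward pass: h = tanh(W1 x), pred = W2 h; squared-error loss.
            h = np.tanh(self.W1 @ x)
            pred = self.W2 @ h
            err = pred - y
            # Backpropagation through the two layers.
            grad_W2 = np.outer(err, h)
            grad_h = self.W2.T @ err
            grad_W1 = np.outer(grad_h * (1 - h ** 2), x)
            self.W2 -= lr * grad_W2
            self.W1 -= lr * grad_W1

    def gossip_round(a, b):
        # Pairwise averaging of the shared layer only: model parameters are
        # exchanged, raw training data never leaves a node.
        avg = (a.W1 + b.W1) / 2
        a.W1, b.W1 = avg.copy(), avg.copy()

    # Toy usage: two nodes with related but distinct regression tasks.
    nodes = [Node(4, 8, 1), Node(4, 8, 1)]
    tasks = [lambda x: np.array([x.sum()]),
             lambda x: np.array([x.sum() + 1.0])]
    for _ in range(200):
        for node, f in zip(nodes, tasks):
            x = rng.normal(size=4)
            node.sgd_step(x, f(x))
        gossip_round(nodes[0], nodes[1])

    x = rng.normal(size=4)
    for i, (node, f) in enumerate(zip(nodes, tasks)):
        h = np.tanh(node.W1 @ x)
        print(f"node {i}: prediction {node.W2 @ h}, target {f(x)}")

Keeping the output layer private is one simple way to let nodes benefit from shared representation learning while still fitting their own task; the paper's actual method, learning algorithms, and convergence analysis are more general than this two-node sketch.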
