Multi-Task Learning with Capsule Networks

Multi-task learning is a machine learning approach that learns multiple tasks jointly while exploiting the commonalities and differences across tasks. It learns a shared representation, so that what is learned for each task can help the other tasks be learned better. Most existing multi-task learning methods adopt a deep neural network as the classifier for each task. However, a deep neural network can exploit its strong curve-fitting capability to achieve high accuracy on the training data even when the learned representation is poor, which contradicts the purpose of multi-task learning. In this paper, we propose a framework named multi-task capsule (MT-Capsule), which improves multi-task learning with capsule networks. A capsule network is a recent architecture that models part-whole relationships to build viewpoint-invariant knowledge and automatically extends the learned knowledge to new scenarios. Experimental results on large real-world datasets show that MT-Capsule significantly outperforms state-of-the-art methods.
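
To make the combination of a shared representation with capsule classifiers concrete, below is a minimal sketch of one plausible multi-task capsule setup: a shared primary-capsule encoder feeds task-specific class-capsule heads, each of which uses dynamic routing by agreement (Sabour et al., 2017). This is an illustration under stated assumptions, not the paper's actual MT-Capsule architecture; all module names, shapes, and hyperparameters here are hypothetical.

```python
# Illustrative sketch only: shared capsule encoder + per-task routing heads.
import torch
import torch.nn as nn
import torch.nn.functional as F


def squash(s, dim=-1, eps=1e-8):
    """Capsule non-linearity: shrinks short vectors toward 0, long ones toward 1."""
    sq_norm = (s ** 2).sum(dim=dim, keepdim=True)
    return (sq_norm / (1.0 + sq_norm)) * s / torch.sqrt(sq_norm + eps)


class RoutingHead(nn.Module):
    """Task-specific class capsules computed by dynamic routing by agreement."""

    def __init__(self, in_caps, in_dim, out_caps, out_dim, iters=3):
        super().__init__()
        self.iters = iters
        # One transformation matrix per (input capsule, output capsule) pair.
        self.W = nn.Parameter(0.01 * torch.randn(in_caps, out_caps, in_dim, out_dim))

    def forward(self, u):  # u: (batch, in_caps, in_dim)
        # Prediction vectors u_hat: (batch, in_caps, out_caps, out_dim)
        u_hat = torch.einsum('bik,iokd->biod', u, self.W)
        b = torch.zeros(u.size(0), u_hat.size(1), u_hat.size(2), device=u.device)
        for _ in range(self.iters):
            c = F.softmax(b, dim=2)                              # coupling coefficients
            v = squash((c.unsqueeze(-1) * u_hat).sum(dim=1))     # (batch, out_caps, out_dim)
            b = b + (u_hat * v.unsqueeze(1)).sum(dim=-1)         # agreement update
        return v


class MultiTaskCapsuleNet(nn.Module):
    """Shared encoder + primary capsules, with one routing head per task."""

    def __init__(self, in_features, num_tasks, classes_per_task,
                 n_primary=8, primary_dim=16, class_dim=16):
        super().__init__()
        self.n_primary, self.primary_dim = n_primary, primary_dim
        self.encoder = nn.Linear(in_features, n_primary * primary_dim)  # shared across tasks
        self.heads = nn.ModuleList([
            RoutingHead(n_primary, primary_dim, classes_per_task, class_dim)
            for _ in range(num_tasks)
        ])

    def forward(self, x, task_id):
        primary = squash(self.encoder(x).view(-1, self.n_primary, self.primary_dim))
        v = self.heads[task_id](primary)   # class capsules for the requested task
        return v.norm(dim=-1)              # capsule length serves as class probability


# Usage: two binary classification tasks sharing one capsule encoder.
model = MultiTaskCapsuleNet(in_features=300, num_tasks=2, classes_per_task=2)
probs = model(torch.randn(4, 300), task_id=0)  # (4, 2) class-capsule lengths
```

The design choice this sketch highlights is that the shared encoder carries the transferable representation, while routing happens only inside each task head, so the agreement dynamics of one task cannot overwrite another task's class capsules.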
