Multi-task Learning with Bidirectional Language Models for Text Classification

Multi-task learning is an effective approach to extracting task-invariant features: by exploiting information shared among related tasks, it improves the performance of each individual task. Most existing work simply divides the model into shared and private spaces, yet provides no explicit mechanism to keep the two spaces from leaking information into each other. As a result, the shared space may be contaminated with task-specific features, while the private space may capture task-invariant ones. To alleviate this problem, we propose a multi-task learning method for text classification based on bidirectional language models. Specifically, we add language modelling as an auxiliary task for the private part, strengthening its ability to extract task-specific features. In addition, to encourage the shared part to learn common features, we impose a uniform-label-distribution loss constraint on the shared part. Finally, the task-specific and task-invariant features are combined by weighted addition to form the final representation, which is then fed to the corresponding softmax layer. We conduct experiments on the FDU-MTL dataset, which consists of 16 different text classification tasks. The experimental results show that our approach outperforms other representative methods.
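
Below is a minimal PyTorch sketch of the architecture as we read the abstract: a shared bidirectional encoder plus per-task private encoders, an auxiliary language-modelling head on the private encoder, a uniform-distribution constraint on a task discriminator over the shared features, and a weighted addition of the two feature vectors. The class name SharedPrivateMTL, the mean pooling, the learnable mixing weight alpha, and the specific loss formulations are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedPrivateMTL(nn.Module):
    # Hypothetical shared-private multi-task model; layer choices and
    # pooling are assumptions made for this sketch.
    def __init__(self, vocab_size, embed_dim, hidden_dim, num_tasks, num_classes):
        super().__init__()
        self.hidden_dim = hidden_dim
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # One shared bidirectional encoder for all tasks.
        self.shared = nn.LSTM(embed_dim, hidden_dim,
                              batch_first=True, bidirectional=True)
        # One private bidirectional encoder per task.
        self.private = nn.ModuleList([
            nn.LSTM(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
            for _ in range(num_tasks)])
        # Auxiliary LM head: the forward half of the private states predicts
        # the next token (a full bidirectional LM would also use the backward
        # half to predict the previous token).
        self.lm_head = nn.Linear(hidden_dim, vocab_size)
        # Task discriminator over shared features; its output is pushed toward
        # the uniform distribution so the shared space carries no
        # task-identifying information.
        self.task_disc = nn.Linear(2 * hidden_dim, num_tasks)
        # Learnable weight for the weighted addition of the two features.
        self.alpha = nn.Parameter(torch.tensor(0.5))
        self.classifiers = nn.ModuleList([
            nn.Linear(2 * hidden_dim, num_classes) for _ in range(num_tasks)])

    def forward(self, tokens, task_id):
        emb = self.embed(tokens)                     # (B, T, E)
        shared_seq, _ = self.shared(emb)             # (B, T, 2H)
        private_seq, _ = self.private[task_id](emb)  # (B, T, 2H)
        shared_feat = shared_seq.mean(dim=1)         # mean pooling over time
        private_feat = private_seq.mean(dim=1)

        # Weighted addition of task-invariant and task-specific features.
        fused = self.alpha * shared_feat + (1.0 - self.alpha) * private_feat
        logits = self.classifiers[task_id](fused)

        # Auxiliary LM loss on the private encoder: the forward-direction
        # state at position t predicts token t+1.
        fwd = private_seq[:, :-1, :self.hidden_dim]
        lm_logits = self.lm_head(fwd)
        lm_loss = F.cross_entropy(
            lm_logits.reshape(-1, lm_logits.size(-1)),
            tokens[:, 1:].reshape(-1))

        # Uniformity constraint: KL divergence between the discriminator's
        # predicted task distribution and the uniform distribution.
        task_logp = F.log_softmax(self.task_disc(shared_feat), dim=-1)
        uniform = torch.full_like(task_logp, 1.0 / task_logp.size(-1))
        uni_loss = F.kl_div(task_logp, uniform, reduction="batchmean")

        return logits, lm_loss, uni_loss
```

In training, the total objective would plausibly be the classification cross-entropy plus weighted lm_loss and uni_loss terms; the weighting coefficients are hyperparameters we assume here, as the abstract does not specify them.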
