USR-MTL: An Unsupervised Sentence Representation Learning Framework with Multi-Task Learning

Learning effective text representations, and sentence features in particular, is increasingly important for building practical intelligent systems. Many previous studies have addressed sentence representation learning with deep learning approaches. However, most existing approaches either train on a single task or rely on a labeled corpus when learning sentence embeddings. In this paper, we assess the factors involved in learning sentence representations and propose an efficient unsupervised sentence representation learning framework with multi-task learning (USR-MTL), in which several text learning tasks are merged into a unified framework. Considering the syntactic and semantic features of sentences, three factors are reflected to some extent in sentence representation learning: the wording, the ordering of the words, and the ordering of the sentences neighboring a target sentence. Hence, we integrate a word-order learning task, a word prediction task, and a sentence-order learning task into the proposed framework to obtain meaningful sentence embeddings. The process of sentence embedding learning is thus reformulated as multi-task learning over one sentence-level task and two word-level tasks. Moreover, the proposed framework is driven by an unsupervised learning algorithm that uses an unlabeled corpus. In our experiments, the approach achieves state-of-the-art performance on downstream natural language processing tasks compared with popular unsupervised representation learning techniques. Experiments on representation visualization and task analysis demonstrate the effectiveness of the tasks in the proposed framework in creating reasonable sentence representations, supporting the capacity of the proposed unsupervised multi-task framework for sentence representation learning.
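To make the three training signals concrete, the following minimal sketch shows one way that examples for a word-order task, a word prediction task, and a sentence-order task could be derived from an unlabeled, tokenized corpus. It is an illustration under stated assumptions, not the construction used in the paper: the function names, the `<mask>` token, the two-word swap, and the 50/50 sampling are all hypothetical choices made here, and sentences are assumed to be non-empty token lists.

```python
import random


def word_order_example(tokens, rng):
    """Word-level task 1 (assumed form): was the word order kept?

    With probability 0.5 two positions are swapped. The label is 1 only
    if the returned sequence equals the original one.
    """
    candidate = list(tokens)
    if len(candidate) >= 2 and rng.random() < 0.5:
        i, j = rng.sample(range(len(candidate)), 2)
        candidate[i], candidate[j] = candidate[j], candidate[i]
    return candidate, int(candidate == list(tokens))


def word_prediction_example(tokens, rng, mask="<mask>"):
    """Word-level task 2 (assumed form): predict one hidden word."""
    k = rng.randrange(len(tokens))
    masked = list(tokens)
    masked[k] = mask
    return masked, tokens[k]


def sentence_order_example(sentences, idx, rng):
    """Sentence-level task (assumed form): is the neighbor pair in order?

    Pairs sentence idx with its successor, flipped with probability 0.5.
    """
    first, second = sentences[idx], sentences[idx + 1]
    if rng.random() < 0.5:
        return (first, second), 1
    return (second, first), 0


def build_examples(sentences, seed=0):
    """Create one example per task for every sentence that has a successor."""
    rng = random.Random(seed)
    examples = []
    for idx in range(len(sentences) - 1):
        tokens = sentences[idx]
        if not tokens:  # skip empty sentences; masking needs at least one word
            continue
        examples.append({
            "word_order": word_order_example(tokens, rng),
            "word_prediction": word_prediction_example(tokens, rng),
            "sentence_order": sentence_order_example(sentences, idx, rng),
        })
    return examples


if __name__ == "__main__":
    corpus = [
        ["the", "cat", "sat"],
        ["it", "was", "tired"],
        ["then", "it", "slept"],
    ]
    for example in build_examples(corpus):
        print(example)
```

No labels are annotated by hand in this sketch: every target is derived from the raw text itself, which is what allows such tasks to be trained on an unlabeled corpus. A shared sentence encoder would then be trained on all three objectives jointly; how the objectives are weighted or combined is not stated in the abstract and is left out here.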
