Pre-train, Interact, Fine-tune: A Novel Interaction Representation for Text Classification

Text representation can aid machines in understanding text. Previous work on text representation often focuses on so-called forward implication: preceding words are taken as the context of later words when creating representations. This ignores the fact that the semantics of a text segment is a product of the mutual implication of its words: later words also contribute to the meaning of preceding words. We introduce the concept of interaction and propose a two-perspective interaction representation that encapsulates a local and a global interaction representation. Here, a local interaction representation is one in which words interact along parent-child relationships in syntactic trees, and a global interaction representation is one in which all words in a sentence interact with each other. We combine the two interaction representations to develop a Hybrid Interaction Representation (HIR). Inspired by existing feature-based and fine-tuning-based approaches to pre-trained language models, we integrate the advantages of both to propose the Pre-train, Interact, Fine-tune (PIF) architecture. We evaluate our proposed models on five widely used datasets for text classification tasks. Our ensemble method outperforms state-of-the-art baselines with improvements ranging from 2.03% to 3.15% in terms of error rate. In addition, we find that the improvement of PIF over most state-of-the-art methods is not affected by increases in text length.
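To make the two perspectives concrete, below is a minimal PyTorch sketch of the interaction stage (our illustration, not the authors' released code). The `TwoPerspectiveInteraction` module, the linear projections, the self-loop adjacency encoding of parent-child links, and the sigmoid gate that mixes the two views are all assumptions made for exposition.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TwoPerspectiveInteraction(nn.Module):
    """Local (syntax-guided) and global (sentence-wide) interaction,
    combined by a learned gate into a hybrid representation (HIR-like).
    The layer layout and the gating combiner are illustrative assumptions."""

    def __init__(self, dim: int):
        super().__init__()
        self.local_proj = nn.Linear(dim, dim)   # keys for tree-masked attention
        self.global_proj = nn.Linear(dim, dim)  # keys for full sentence attention
        self.gate = nn.Linear(2 * dim, dim)     # mixes the two perspectives

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # h:   (seq_len, dim) word vectors from a pre-trained encoder
        #      (the "Pre-train" stage of PIF).
        # adj: (seq_len, seq_len) 0/1 parent-child mask from the syntactic
        #      tree; assumed to include self-loops so no row is all zero.
        local_scores = h @ self.local_proj(h).t()
        local_scores = local_scores.masked_fill(adj == 0, float("-inf"))
        local = F.softmax(local_scores, dim=-1) @ h                 # local interaction
        glob = F.softmax(h @ self.global_proj(h).t(), dim=-1) @ h   # global interaction
        g = torch.sigmoid(self.gate(torch.cat([local, glob], dim=-1)))
        return g * local + (1 - g) * glob                           # hybrid (HIR)
```

A toy invocation, where the hypothetical output would feed the fine-tuned classification head:

```python
h = torch.randn(5, 64)                       # 5 words, 64-dim pre-trained features
adj = torch.eye(5)                           # self-loops keep every row valid
adj[0, 1] = adj[1, 0] = 1.0                  # toy parent-child link (words 0 and 1)
hir = TwoPerspectiveInteraction(64)(h, adj)  # (5, 64) hybrid interaction representation
```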
