Jumper: Learning When to Make Classification Decision in Reading

In early years, text classification is typically accomplished by feature-based machine learning models; recently, deep neural networks, as a powerful learning machine, make it possible to work with raw input as the text stands. However, exiting end-to-end neural networks lack explicit interpretation of the prediction. In this paper, we propose a novel framework, JUMPER, inspired by the cognitive process of text reading, that models text classification as a sequential decision process. Basically, JUMPER is a neural system that scans a piece of text sequentially and makes classification decisions at the time it wishes. Both the classification result and when to make the classification are part of the decision process, which is controlled by a policy network and trained with reinforcement learning. Experimental results show that a properly trained JUMPER has the following properties: (1) It can make decisions whenever the evidence is enough, therefore reducing total text reading by 30-40% and often finding the key rationale of prediction. (2) It achieves classification accuracy better than or comparable to state-of-the-art models in several benchmark and industrial datasets.

[1]  Bo Pang,et al.  A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts , 2004, ACL.

[2]  Christopher D. Manning,et al.  Baselines and Bigrams: Simple, Good Sentiment and Topic Classification , 2012, ACL.

[3]  Thomas G. Dietterich What is machine learning? , 2020, Archives of Disease in Childhood.

[4]  Jeffrey Pennington,et al.  GloVe: Global Vectors for Word Representation , 2014, EMNLP.

[5]  R. J. Williams,et al.  Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.

[6]  Bo Pang,et al.  Thumbs up? Sentiment Classification using Machine Learning Techniques , 2002, EMNLP.

[7]  Zhengdong Lu,et al.  Object-oriented Neural Programming (OONP) for Document Understanding , 2017, ACL.

[8]  Yoshua Bengio,et al.  Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation , 2014, EMNLP.

[9]  Ye Zhang,et al.  Rationale-Augmented Convolutional Neural Networks for Text Classification , 2016, EMNLP.

[10]  Xiang Zhang,et al.  Character-level Convolutional Networks for Text Classification , 2015, NIPS.

[11]  Jian Zhang,et al.  SQuAD: 100,000+ Questions for Machine Comprehension of Text , 2016, EMNLP.

[12]  Jun Zhao,et al.  Relation Classification via Convolutional Deep Neural Network , 2014, COLING.

[13]  Bowen Zhou,et al.  A Structured Self-attentive Sentence Embedding , 2017, ICLR.

[14]  Gary Marcus,et al.  Deep Learning: A Critical Appraisal , 2018, ArXiv.

[15]  Hang Li,et al.  Coupling Distributed and Symbolic Execution for Natural Language Queries , 2016, ICML.

[16]  David Vandyke,et al.  A Network-based End-to-End Trainable Task-oriented Dialogue System , 2016, EACL.

[17]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[18]  Quoc V. Le,et al.  Learning to Skim Text , 2017, ACL.