The Natural Language Decathlon: Multitask Learning as Question Answering

Presented on August 28, 2018 at 12:15 p.m. in the Pettit Microelectronics Research Center, Room 102 A/B.

[1]  Terry Winograd,et al.  Understanding natural language , 1974 .

[2]  Michael McCloskey,et al.  Catastrophic Interference in Connectionist Networks: The Sequential Learning Problem , 1989 .

[3]  R Ratcliff,et al.  Connectionist models of recognition memory: constraints imposed by learning and forgetting functions. , 1990, Psychological review.

[4]  Jürgen Schmidhuber,et al.  Learning to Control Fast-Weight Memories: An Alternative to Dynamic Recurrent Networks , 1992, Neural Computation.

[5]  Anthony V. Robins,et al.  Catastrophic Forgetting, Rehearsal and Pseudorehearsal , 1995, Connect. Sci..

[6]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[7]  Sebastian Thrun,et al.  Learning to Learn: Introduction and Overview , 1998, Learning to Learn.

[8]  Rich Caruana,et al.  Multitask Learning , 1998, Encyclopedia of Machine Learning and Data Mining.

[9]  Sebastian Thrun,et al.  Lifelong Learning Algorithms , 1998, Learning to Learn.

[10]  Yaroslav Fyodorov,et al.  A Natural Logic Inference System , 2000 .

[11]  Sepp Hochreiter,et al.  Learning to Learn Using Gradient Descent , 2001, ICANN.

[12]  Salim Roukos,et al.  Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.

[13]  Daniel G. Bobrow,et al.  Entailment, intensionality and text understanding , 2003, HLT-NAACL 2003.

[14]  Chin-Yew Lin,et al.  ROUGE: A Package for Automatic Evaluation of Summaries , 2004, ACL 2004.

[15]  Johan Bos,et al.  Recognising Textual Entailment with Robust Logical Inference , 2005, MLCW.

[16]  Jürgen Schmidhuber,et al.  Framewise phoneme classification with bidirectional LSTM and other neural network architectures , 2005, Neural Networks.

[17]  Ricardo Vilalta,et al.  A Perspective View and Survey of Meta-Learning , 2002, Artificial Intelligence Review.

[18]  Ido Dagan,et al.  The Third PASCAL Recognizing Textual Entailment Challenge , 2007, ACL-PASCAL@ACL.

[19]  Tom M. Mitchell,et al.  The Need for Biases in Learning Generalizations , 2007 .

[20]  Yoshua Bengio,et al.  On the Optimization of a Synaptic Learning Rule , 2007 .

[21]  Richard Johansson,et al.  Dependency-based Semantic Role Labeling of PropBank , 2008, EMNLP.

[22]  Jason Weston,et al.  A unified architecture for natural language processing: deep neural networks with multitask learning , 2008, ICML '08.

[23]  Dan Roth,et al.  The Importance of Syntactic Parsing and Inference in Semantic Role Labeling , 2008, CL.

[24]  Christopher D. Manning,et al.  An extended model of natural logic , 2009, IWCS.

[25]  Jason Weston,et al.  Curriculum learning , 2009, ICML '09.

[26]  Geoffrey E. Hinton,et al.  Rectified Linear Units Improve Restricted Boltzmann Machines , 2010, ICML.

[27]  Hector J. Levesque,et al.  The Winograd Schema Challenge , 2011, AAAI Spring Symposium: Logical Formalizations of Commonsense Reasoning.

[28]  Jason Weston,et al.  Natural Language Processing (Almost) from Scratch , 2011, J. Mach. Learn. Res..

[29]  Jeffrey Dean,et al.  Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.

[30]  Christopher Potts,et al.  Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank , 2013, EMNLP.

[31]  Jeffrey Dean,et al.  Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.

[32]  Kalyanmoy Deb,et al.  Multi-objective Optimization , 2014 .

[33]  Quoc V. Le,et al.  Sequence to Sequence Learning with Neural Networks , 2014, NIPS.

[34]  Jeffrey Pennington,et al.  GloVe: Global Vectors for Word Representation , 2014, EMNLP.

[35]  Christopher D. Manning,et al.  Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks , 2015, ACL.

[36]  Misha Denil,et al.  From Group to Individual Labels Using Deep Features , 2015, KDD.

[37]  Sanja Fidler,et al.  Skip-Thought Vectors , 2015, NIPS.

[38]  Luke S. Zettlemoyer,et al.  Question-Answer Driven Semantic Role Labeling: Using Natural Language to Annotate Natural Language , 2015, EMNLP.

[39]  Christopher D. Manning,et al.  Effective Approaches to Attention-based Neural Machine Translation , 2015, EMNLP.

[40]  Navdeep Jaitly,et al.  Pointer Networks , 2015, NIPS.

[41]  Wei Xu,et al.  End-to-end learning of semantic role labeling using recurrent neural networks , 2015, ACL.

[42]  Phil Blunsom,et al.  Teaching Machines to Read and Comprehend , 2015, NIPS.

[43]  Christopher Potts,et al.  A large annotated corpus for learning natural language inference , 2015, EMNLP.

[44]  Yoshua Bengio,et al.  Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.

[45]  Jian Zhang,et al.  SQuAD: 100,000+ Questions for Machine Comprehension of Text , 2016, EMNLP.

[46]  Deniz Yuret,et al.  Transfer Learning for Low-Resource Neural Machine Translation , 2016, EMNLP.

[47]  Daan Wierstra,et al.  Meta-Learning with Memory-Augmented Neural Networks , 2016, ICML.

[48]  Marcin Andrychowicz,et al.  Learning to learn by gradient descent by gradient descent , 2016, NIPS.

[49]  Richard Socher,et al.  Ask Me Anything: Dynamic Memory Networks for Natural Language Processing , 2015, ICML.

[50]  Mauro Cettolo,et al.  The IWSLT 2016 Evaluation Campaign , 2016, IWSLT.

[51]  Geoffrey E. Hinton,et al.  Layer Normalization , 2016, ArXiv.

[52]  Yang Yu,et al.  End-to-End Reading Comprehension with Dynamic Answer Chunk Ranking , 2016, ArXiv.

[53]  Bowen Zhou,et al.  Pointing the Unknown Words , 2016, ACL.

[54]  Alexander Gepperth,et al.  A Bio-Inspired Incremental Learning Architecture for Applied Perceptual Problems , 2016, Cognitive Computation.

[55]  Hang Li,et al.  “ Tony ” DNN Embedding for “ Tony ” Selective Read for “ Tony ” ( a ) Attention-based Encoder-Decoder ( RNNSearch ) ( c ) State Update s 4 SourceVocabulary Softmax Prob , 2016 .

[56]  Quoc V. Le,et al.  Multi-task Sequence to Sequence Learning , 2015, ICLR.

[57]  George Kurian,et al.  Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation , 2016, ArXiv.

[58]  Bowen Zhou,et al.  Abstractive Text Summarization using Sequence-to-sequence RNNs and Beyond , 2016, CoNLL.

[59]  Richard Socher,et al.  Dynamic Memory Networks for Visual and Textual Question Answering , 2016, ICML.

[60]  Rico Sennrich,et al.  The University of Edinburgh’s Neural MT Systems for WMT17 , 2017, WMT.

[61]  David Vandyke,et al.  A Network-based End-to-End Trainable Task-oriented Dialogue System , 2016, EACL.

[62]  Byoung-Tak Zhang,et al.  Overcoming Catastrophic Forgetting by Incremental Moment Matching , 2017, NIPS.

[63]  Richard Socher,et al.  Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning , 2018, ArXiv.

[64]  Deng Cai,et al.  MEMEN: Multi-layer Embedding with Memory Networks for Machine Comprehension , 2017, ArXiv.

[65]  Zhenhua Ling,et al.  Natural Language Inference with External Knowledge , 2017, ICLR 2018.

[66]  Lukasz Kaiser,et al.  Attention is All you Need , 2017, NIPS.

[67]  Chrisantha Fernando,et al.  PathNet: Evolution Channels Gradient Descent in Super Neural Networks , 2017, ArXiv.

[68]  Masaaki Nagata,et al.  Cutting-off Redundant Repeating Generations for Neural Abstractive Summarization , 2016, EACL.

[69]  Ali Farhadi,et al.  Bidirectional Attention Flow for Machine Comprehension , 2016, ICLR.

[70]  Philip Bachman,et al.  NewsQA: A Machine Comprehension Dataset , 2016, Rep4NLP@ACL.

[71]  Yann Dauphin,et al.  Convolutional Sequence to Sequence Learning , 2017, ICML.

[72]  Dirk Weissenborn,et al.  Making Neural QA as Simple as Possible but not Simpler , 2017, CoNLL.

[73]  Razvan Pascanu,et al.  Overcoming catastrophic forgetting in neural networks , 2016, Proceedings of the National Academy of Sciences.

[74]  Rui Liu,et al.  Phase Conductor on Multi-layered Attentions for Machine Comprehension , 2017, ArXiv.

[75]  Ilya Sutskever,et al.  Learning to Generate Reviews and Discovering Sentiment , 2017, ArXiv.

[76]  Christopher D. Manning,et al.  Get To The Point: Summarization with Pointer-Generator Networks , 2017, ACL.

[77]  Hong Yu,et al.  Neural Tree Indexers for Text Understanding , 2016, EACL.

[78]  Diego Marcheggiani,et al.  A Simple and Accurate Syntax-Agnostic Neural Model for Dependency-based Semantic Role Labeling , 2017, CoNLL.

[79]  Quoc V. Le,et al.  Unsupervised Pretraining for Sequence to Sequence Learning , 2016, EMNLP.

[80]  Lukasz Kaiser,et al.  One Model To Learn Them All , 2017, ArXiv.

[81]  Ramakanth Pasunuru,et al.  Towards Improving Abstractive Summarization via Entailment Generation , 2017, NFiS@EMNLP.

[82]  Martin Wattenberg,et al.  Google’s Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation , 2016, TACL.

[83]  Luke S. Zettlemoyer,et al.  Deep Semantic Role Labeling: What Works and What’s Next , 2017, ACL.

[84]  Zhiguo Wang,et al.  Bilateral Multi-Perspective Matching for Natural Language Sentences , 2017, IJCAI.

[85]  Yoshimasa Tsuruoka,et al.  A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks , 2016, EMNLP.

[86]  Ming Zhou,et al.  Gated Self-Matching Networks for Reading Comprehension and Question Answering , 2017, ACL.

[87]  Hannaneh Hajishirzi,et al.  Question Answering through Transfer Learning from Large Fine-grained Supervision Data , 2017, ACL.

[88]  Richard Socher,et al.  Learned in Translation: Contextualized Word Vectors , 2017, NIPS.

[89]  Sergey Levine,et al.  Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks , 2017, ICML.

[90]  Hong Yu,et al.  Neural Semantic Encoders , 2016, EACL.

[91]  Sungzoon Cho,et al.  Distance-based Self-Attention Network for Natural Language Inference , 2017, ArXiv.

[92]  Tsung-Hsien Wen,et al.  Neural Belief Tracker: Data-Driven Dialogue State Tracking , 2016, ACL.

[93]  Joachim Bingel,et al.  Sluice networks: Learning what to share between loosely related tasks , 2017, ArXiv.

[94]  Richard Socher,et al.  Dynamic Coattention Networks For Question Answering , 2016, ICLR.

[95]  Richard Socher,et al.  Pointer Sentinel Mixture Models , 2016, ICLR.

[96]  Eunsol Choi,et al.  TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension , 2017, ACL.

[97]  Omer Levy,et al.  Zero-Shot Relation Extraction via Reading Comprehension , 2017, CoNLL.

[98]  Shuohang Wang,et al.  Machine Comprehension Using Match-LSTM and Answer Pointer , 2016, ICLR.

[99]  Jihun Choi,et al.  Learning to Compose Task-Specific Tree Structures , 2017, AAAI.

[100]  Richard Socher,et al.  A Deep Reinforced Model for Abstractive Summarization , 2017, ICLR.

[101]  Richard Socher,et al.  DCN+: Mixed Objective and Deep Residual Coattention for Question Answering , 2017, ICLR.

[102]  Xiaoli Z. Fern,et al.  DR-BiLSTM: Dependent Reading Bidirectional LSTM for Natural Language Inference , 2018, NAACL.

[103]  Richard Socher,et al.  Global-Locally Self-Attentive Encoder for Dialogue State Tracking , 2018, ACL.

[104]  Luke S. Zettlemoyer,et al.  Deep Contextualized Word Representations , 2018, NAACL.

[105]  Ramakanth Pasunuru,et al.  Multi-Reward Reinforced Summarization with Saliency and Entailment , 2018, NAACL.

[106]  Svetlana Lazebnik,et al.  PackNet: Adding Multiple Tasks to a Single Network by Iterative Pruning , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[107]  Mor Naaman,et al.  Newsroom: A Dataset of 1.3 Million Summaries with Diverse Extractive Strategies , 2018, NAACL.

[108]  Ronald Kemker,et al.  Measuring Catastrophic Forgetting in Neural Networks , 2017, AAAI.

[109]  Chengqi Zhang,et al.  Reinforced Self-Attention Network: a Hybrid of Hard and Soft Attention for Sequence Modeling , 2018, IJCAI.

[110]  Samuel R. Bowman,et al.  A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference , 2017, NAACL.

[111]  Yonatan Belinkov,et al.  On the Evaluation of Semantic Phenomena in Neural Machine Translation Using Natural Language Inference , 2018, NAACL.

[112]  Yidong Chen,et al.  Deep Semantic Role Labeling with Self-Attention , 2017, AAAI.

[113]  Rachel Rudinger,et al.  Collecting Diverse Natural Language Inference Problems for Sentence Representation Evaluation , 2018, BlackboxNLP@EMNLP.

[114]  Jonathan Berant,et al.  Contextualized Word Representations for Reading Comprehension , 2017, NAACL.

[115]  Xiaodong Liu,et al.  Stochastic Answer Networks for Machine Reading Comprehension , 2017, ACL.

[116]  Quoc V. Le,et al.  QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension , 2018, ICLR.

[117]  Omer Levy,et al.  GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding , 2018, BlackboxNLP@EMNLP.

[118]  Po-Sen Huang,et al.  Natural Language to Structured Query Generation via Meta-Learning , 2018, NAACL.

[119]  Sebastian Ruder,et al.  Fine-tuned Language Models for Text Classification , 2018, ArXiv.

[120]  Ankur Bapna,et al.  The Best of Both Worlds: Combining Recent Advances in Neural Machine Translation , 2018, ACL.

[121]  Yelong Shen,et al.  FusionNet: Fusing via Fully-Aware Attention with Application to Machine Comprehension , 2017, ICLR.

[122]  Rachel Rudinger,et al.  Towards a Unified Natural Language Inference Framework to Evaluate Sentence Representations , 2018, ArXiv.

[123]  Tao Yu,et al.  TypeSQL: Knowledge-Based Type-Aware Neural Text-to-SQL Generation , 2018, NAACL.

[124]  Ming Zhou,et al.  Reinforced Mnemonic Reader for Machine Reading Comprehension , 2017, IJCAI.

[125]  Siu Cheung Hui,et al.  A Compare-Propagate Architecture with Alignment Factorization for Natural Language Inference , 2017, ArXiv.

[126]  Mirella Lapata,et al.  Coarse-to-Fine Decoding for Neural Semantic Parsing , 2018, ACL.

[127]  Jin-Hyuk Hong,et al.  Semantic Sentence Matching with Densely-connected Recurrent and Co-attentive Information , 2018, AAAI.