Stochastic Answer Networks for Natural Language Inference

We propose a stochastic answer network (SAN) to explore multi-step inference strategies for Natural Language Inference. Rather than directly predicting the result given the inputs, the model maintains a state and iteratively refines its predictions. Our experiments show that SAN achieves state-of-the-art results on three benchmarks: the Stanford Natural Language Inference (SNLI) dataset, the Multi-Genre Natural Language Inference (MultiNLI) dataset, and the Quora Question Pairs dataset.
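The multi-step answer module can be pictured roughly as follows: at each step, the state attends over a memory of the premise, is updated by a recurrent cell, and emits a prediction; the per-step predictions are averaged, with steps randomly dropped during training in the stochastic variant. The code below is a minimal sketch under these assumptions in PyTorch style; the module name AnswerModule and parameters such as num_steps and hidden_dim are illustrative and not the authors' released implementation.

```python
# Minimal sketch of a multi-step answer module (hedged, illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class AnswerModule(nn.Module):
    def __init__(self, hidden_dim: int, num_classes: int, num_steps: int = 5):
        super().__init__()
        self.num_steps = num_steps
        self.gru = nn.GRUCell(hidden_dim, hidden_dim)   # refines the state
        self.attn = nn.Linear(hidden_dim, hidden_dim)   # state-to-memory attention
        self.classifier = nn.Linear(4 * hidden_dim, num_classes)

    def forward(self, premise_mem, hypothesis_summary):
        # premise_mem: (batch, prem_len, hidden_dim) memory of the premise
        # hypothesis_summary: (batch, hidden_dim) initial state from the hypothesis
        s = hypothesis_summary
        step_probs = []
        for _ in range(self.num_steps):
            # Attend over the premise memory with the current state.
            scores = torch.bmm(premise_mem, self.attn(s).unsqueeze(2)).squeeze(2)
            alpha = F.softmax(scores, dim=1)
            x = torch.bmm(alpha.unsqueeze(1), premise_mem).squeeze(1)
            # Refine the state, then predict at this step.
            s = self.gru(x, s)
            feats = torch.cat([s, x, torch.abs(s - x), s * x], dim=1)
            step_probs.append(F.softmax(self.classifier(feats), dim=1))
        # Average the per-step predictions; during training individual steps
        # can be randomly dropped (stochastic prediction dropout).
        return torch.stack(step_probs).mean(dim=0)
```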
