Information Aggregation for Multi-Head Attention with Routing-by-Agreement
Jian Li | Baosong Yang | Zi-Yi Dou | Xing Wang | Michael R. Lyu | Zhaopeng Tu
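The routing-by-agreement in the title refers to the dynamic routing procedure of Sabour et al. [1], used here to aggregate the outputs of attention heads rather than simply concatenating them. Below is a minimal NumPy sketch of that aggregation step, not the authors' released code: it assumes per-head "vote" vectors have already been produced by learned linear transforms of the head outputs, and the iteration scheme and squash nonlinearity follow [1]; all shapes and names are illustrative assumptions.

```python
# Minimal sketch of routing-by-agreement over attention heads (assumptions
# noted in the lead-in; follows the dynamic routing of Sabour et al. [1]).
import numpy as np

def squash(v, axis=-1, eps=1e-8):
    """Squash nonlinearity from [1]: keeps direction, bounds norm to (0, 1)."""
    norm_sq = np.sum(v * v, axis=axis, keepdims=True)
    return (norm_sq / (1.0 + norm_sq)) * v / np.sqrt(norm_sq + eps)

def route_by_agreement(head_votes, n_iters=3):
    """Aggregate per-head votes (n_heads, n_out, d) into n_out capsules (n_out, d).

    Routing weights start uniform and are iteratively sharpened toward
    output capsules that agree (high dot product) with each head's vote.
    """
    n_heads, n_out, _ = head_votes.shape
    logits = np.zeros((n_heads, n_out))                 # routing logits b_ij
    for _ in range(n_iters):
        c = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)  # softmax over outputs
        s = np.einsum('ij,ijd->jd', c, head_votes)      # weighted sum of votes
        v = squash(s)                                   # candidate output capsules
        logits += np.einsum('ijd,jd->ij', head_votes, v)  # agreement update
    return v

# Example: aggregate 8 attention heads into 4 output capsules of dimension 64.
votes = np.random.randn(8, 4, 64).astype(np.float32)
print(route_by_agreement(votes).shape)  # (4, 64)
```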
[1] Geoffrey E. Hinton, et al. Dynamic Routing Between Capsules, 2017, NIPS.
[2] Geoffrey E. Hinton, et al. Transforming Auto-Encoders, 2011, ICANN.
[3] Xuanjing Huang, et al. Information Aggregation via Dynamic Routing for Sequence Encoding, 2018, COLING.
[4] Rico Sennrich, et al. Neural Machine Translation of Rare Words with Subword Units, 2016, ACL.
[5] Salim Roukos, et al. Bleu: a Method for Automatic Evaluation of Machine Translation, 2002, ACL.
[6] Andrew McCallum, et al. Linguistically-Informed Self-Attention for Semantic Role Labeling, 2018, EMNLP.
[7] Yann Dauphin, et al. Convolutional Sequence to Sequence Learning, 2017, ICML.
[8] Yoshua Bengio, et al. Neural Machine Translation by Jointly Learning to Align and Translate, 2015, ICLR.
[9] Yoshua Bengio, et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention, 2015, ICML.
[10] George Kurian, et al. Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation, 2016, arXiv.
[11] Min Yang, et al. Investigating Capsule Networks with Dynamic Routing for Text Classification, 2018, EMNLP.
[12] Guillaume Lample, et al. What you can cram into a single $&!#* vector: Probing sentence embeddings for linguistic properties, 2018, ACL.
[13] Philipp Koehn. Statistical Significance Tests for Machine Translation Evaluation, 2004, EMNLP.
[14] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[15] Shuming Shi, et al. Dynamic Layer Aggregation for Neural Machine Translation with Routing-by-Agreement, 2019, AAAI.
[16] Rico Sennrich, et al. Why Self-Attention? A Targeted Evaluation of Neural Machine Translation Architectures, 2018, EMNLP.
[17] Trevor Darrell, et al. Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding, 2016, EMNLP.
[18] Zhaopeng Tu, et al. Convolutional Self-Attention Networks, 2019, NAACL.
[19] Christopher D. Manning, et al. Effective Approaches to Attention-based Neural Machine Translation, 2015, EMNLP.
[20] Richard Socher, et al. Weighted Transformer Network for Machine Translation, 2017, arXiv.
[21] Xing Shi, et al. Does String-Based Neural MT Learn Source Syntax?, 2016, EMNLP.
[22] Tao Shen, et al. DiSAN: Directional Self-Attention Network for RNN/CNN-free Language Understanding, 2018, AAAI.
[23] Shuming Shi, et al. Exploiting Deep Representations for Neural Machine Translation, 2018, EMNLP.
[24] Ulas Bagci, et al. Capsules for Object Segmentation, 2018, arXiv.
[25] Lijun Wu, et al. Achieving Human Parity on Automatic Chinese to English News Translation, 2018, arXiv.
[26] Bowen Zhou, et al. A Structured Self-attentive Sentence Embedding, 2017, ICLR.
[27] Matthieu Cord, et al. MUTAN: Multimodal Tucker Fusion for Visual Question Answering, 2017, ICCV.
[28] Lukasz Kaiser, et al. Universal Transformers, 2019, ICLR.
[29] Geoffrey E. Hinton, et al. Matrix capsules with EM routing, 2018, ICLR.
[30] Luke S. Zettlemoyer, et al. Deep Contextualized Word Representations, 2018, NAACL.
[31] Jian Li, et al. Multi-Head Attention with Disagreement Regularization, 2018, EMNLP.
[32] Yang Jin, et al. Capsule Network Performance on Complex Data, 2017, arXiv.
[33] Jörg Tiedemann, et al. An Analysis of Encoder Representations in Transformer-Based Machine Translation, 2018, BlackboxNLP@EMNLP.
[34] Yoshua Bengio, et al. Attention-Based Models for Speech Recognition, 2015, NIPS.
[35] Tobias Domhan, et al. How Much Attention Do You Need? A Granular Analysis of Neural Machine Translation Architectures, 2018, ACL.
[36] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.