Cluster-Former: Clustering-based Sparse Transformer for Question Answering
Shuohang Wang | Luowei Zhou | Zhe Gan | Yen-Chun Chen | Yuwei Fang | Siqi Sun | Yu Cheng | Jingjing Liu