CogLTX: Applying BERT to Long Texts
Chang Zhou | Hongxia Yang | Ming Ding | Jie Tang
[1] Lei Li, et al. Dynamically Fused Graph Network for Multi-hop Reasoning, 2019, ACL.
[2] Yuan Luo, et al. Graph Convolutional Networks for Text Classification, 2018, AAAI.
[3] Arman Cohan, et al. Longformer: The Long-Document Transformer, 2020, ArXiv.
[4] Yoon Kim, et al. Convolutional Neural Networks for Sentence Classification, 2014, EMNLP.
[5] Shuohang Wang, et al. Machine Comprehension Using Match-LSTM and Answer Pointer, 2016, ICLR.
[6] Jian Zhang, et al. SQuAD: 100,000+ Questions for Machine Comprehension of Text, 2016, EMNLP.
[7] Wei Wang, et al. Multi-Granularity Hierarchical Attention Fusion Networks for Reading Comprehension and Question Answering, 2018, ACL.
[8] Dirk Weissenborn, et al. Making Neural QA as Simple as Possible but not Simpler, 2017, CoNLL.
[9] Tomas Mikolov, et al. Bag of Tricks for Efficient Text Classification, 2016, EACL.
[10] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[11] Najim Dehak, et al. Joint Verification-Identification in end-to-end Multi-Scale CNN Framework for Topic Identification, 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[12] M. D’Esposito. Working Memory, 2008, Handbook of Clinical Neurology.
[13] Omer Levy, et al. GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding, 2018, BlackboxNLP@EMNLP.
[14] Philip Bachman, et al. NewsQA: A Machine Comprehension Dataset, 2016, Rep4NLP@ACL.
[15] Ken Lang, et al. NewsWeeder: Learning to Filter Netnews, 1995, ICML.
[16] Hwee Tou Ng, et al. A Question-Focused Multi-Factor Attention Network for Question Answering, 2018, AAAI.
[17] Gerard Salton, et al. A vector space model for automatic indexing, 1975, CACM.
[18] Jeffrey Pennington, et al. GloVe: Global Vectors for Word Representation, 2014, EMNLP.
[19] Chang Zhou, et al. Cognitive Graph for Multi-Hop Reading Comprehension at Scale, 2019, ACL.
[20] Ramesh Nallapati, et al. Multi-passage BERT: A Globally Normalized BERT Model for Open-domain Question Answering, 2019, EMNLP.
[21] O. Wilhelm, et al. Working memory capacity - facets of a cognitive ability construct, 2000.
[22] Ming-Wei Chang, et al. Latent Retrieval for Weakly Supervised Open Domain Question Answering, 2019, ACL.
[23] Shakir Mohamed, et al. Variational Inference with Normalizing Flows, 2015, ICML.
[25] D. Rubin, et al. Maximum likelihood from incomplete data via the EM algorithm (with discussion), 1977.
[26] Hannaneh Hajishirzi, et al. Multi-hop Reading Comprehension through Question Decomposition and Rescoring, 2019, ACL.
[27] Jian Su, et al. Densely Connected Attention Propagation for Reading Comprehension, 2018, NeurIPS.
[28] Richard Socher, et al. Learning to Retrieve Reasoning Paths over Wikipedia Graph for Question Answering, 2019, ICLR.
[29] Yiming Yang, et al. Transformer-XL: Attentive Language Models beyond a Fixed-Length Context, 2019, ACL.
[30] Yoshua Bengio, et al. HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering, 2018, EMNLP.
[31] 黄瀬浩一 (Koichi Kise), et al. An Experimental Study of Self-supervised Learning for Reading Behavior Recognition, 2019.
[32] Youngja Park, et al. Unsupervised Sentence Embedding Using Document Structure-Based Context, 2019, ECML/PKDD.
[33] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[34] John Brown. Some Tests of the Decay Theory of Immediate Memory, 1958.
[35] Eunsol Choi, et al. MRQA 2019 Shared Task: Evaluating Generalization in Reading Comprehension, 2019, MRQA@EMNLP.
[36] Xiao Liu, et al. Self-supervised Learning: Generative or Contrastive, 2020, ArXiv.
[37] Rajarshi Das, et al. Multi-step Retriever-Reader Interaction for Scalable Open-domain Question Answering, 2019, ICLR.
[38] Zhe Gan, et al. Hierarchical Graph Network for Multi-hop Question Answering, 2019, EMNLP.
[39] Ali Farhadi, et al. Bidirectional Attention Flow for Machine Comprehension, 2016, ICLR.
[40] T. E. Lange, et al. Below the Surface: Analogical Similarity and Retrieval Competition in Reminding, 1994, Cognitive Psychology.
[41] P. Carpenter, et al. Individual differences in working memory and reading, 1980.
[42] Edouard Grave, et al. Adaptive Attention Span in Transformers, 2019, ACL.
[43] Masaaki Nagata, et al. Answering while Summarizing: Multi-task Learning for Multi-hop QA with Evidence Extraction, 2019, ACL.
[44] Omer Levy, et al. RoBERTa: A Robustly Optimized BERT Pretraining Approach, 2019, ArXiv.
[45] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[46] Jason Weston, et al. Reading Wikipedia to Answer Open-Domain Questions, 2017, ACL.
[47] Max Welling, et al. Auto-Encoding Variational Bayes, 2013, ICLR.
[48] G. A. Miller. The Magical Number Seven, Plus or Minus Two: Some Limits on Our Capacity for Processing Information, 1956, Psychological Review.
[49] Timothy P. Lillicrap, et al. Compressive Transformers for Long-Range Sequence Modelling, 2019, ICLR.
[50] Michael Collins, et al. Discriminative Reranking for Natural Language Parsing, 2000, CL.
[51] P. Barrouillet, et al. Time constraints and resource sharing in adults' working memory spans, 2004, Journal of Experimental Psychology: General.
[52] Lukasz Kaiser, et al. Reformer: The Efficient Transformer, 2020, ICLR.
[53] Omer Levy, et al. Blockwise Self-Attention for Long Document Understanding, 2020, EMNLP.
[54] Jesús Villalba, et al. Hierarchical Transformers for Long Document Classification, 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[55] Richard Socher, et al. Efficient and Robust Question Answering from Minimal Context over Documents, 2018, ACL.