Erich Elsen, Arthur Mensch, Simon Osindero, Oriol Vinyals, Karen Simonyan, Albin Cassirer, Michela Paganini, Laurent Sifre, Jacob Menick, Aidan Clark, Diego de Las Casas, Jordan Hoffmann, Jean-Baptiste Lespiau, Jack W. Rae, Sebastian Borgeaud, Aurelia Guy, Roman Ring, George van den Driessche, Geoffrey Irving, Trevor Cai, Eliza Rutherford, Katie Millican, Bogdan Damoc, Tom Hennigan, Saffron Huang, Loren Maggiore, Chris Jones, Andy Brock
[1] Edouard Grave, et al. Leveraging Passage Retrieval with Generative Models for Open Domain Question Answering, 2020, EACL.
[2] Frank Hutter, et al. Decoupled Weight Decay Regularization, 2017, ICLR.
[3] Sebastian Riedel, et al. Question and Answer Test-Train Overlap in Open-Domain Question Answering Datasets, 2020, EACL.
[4] Lukáš Burget, et al. Recurrent neural network based language model, 2010, INTERSPEECH.
[5] Yonghui Wu, et al. Exploring the Limits of Language Modeling, 2016, arXiv.
[6] Mohammad Shoeybi, et al. Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism, 2019, arXiv.
[7] Jason Weston, et al. Retrieval Augmentation Reduces Hallucination in Conversation, 2021, EMNLP.
[8] Yoshua Bengio, et al. A Neural Knowledge Language Model, 2016, arXiv.
[9] Emily M. Bender, et al. On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜, 2021, FAccT.
[10] Ian Goodfellow, et al. Deep Learning with Differential Privacy, 2016, CCS.
[11] Satoshi Nakamura, et al. Guiding Neural Machine Translation with Retrieved Translation Pieces, 2018, NAACL.
[12] Yiming Yang, et al. Transformer-XL: Attentive Language Models beyond a Fixed-Length Context, 2019, ACL.
[13] Omer Levy, et al. Generalization through Memorization: Nearest Neighbor Language Models, 2020, ICLR.
[14] Andrew McCallum, et al. Energy and Policy Considerations for Deep Learning in NLP, 2019, ACL.
[15] Sanjiv Kumar, et al. Accelerating Large-Scale Inference with Anisotropic Vector Quantization, 2019, ICML.
[16] Timnit Gebru, et al. Lessons from archives: strategies for collecting sociocultural data in machine learning, 2019, FAT*.
[17] Ming-Wei Chang, et al. Natural Questions: A Benchmark for Question Answering Research, 2019, TACL.
[18] Charles Foster, et al. The Pile: An 800GB Dataset of Diverse Text for Language Modeling, 2020, arXiv.
[19] Alexei Baevski, et al. Adaptive Input Representations for Neural Language Modeling, 2018, ICLR.
[20] Byron C. Wallace, et al. Attention is not Explanation, 2019, NAACL.
[21] Fabio Petroni, et al. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks, 2020, NeurIPS.
[22] Yejin Choi, et al. RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models, 2020, Findings of EMNLP.
[23] Ilya Sutskever, et al. Language Models are Unsupervised Multitask Learners, 2019.
[24] Douglas Eck, et al. Deduplicating Training Data Makes Language Models Better, 2021, arXiv.
[25] Michael I. Jordan, et al. Latent Dirichlet Allocation, 2001, J. Mach. Learn. Res.
[26] Colin Raffel, et al. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer, 2019, J. Mach. Learn. Res.
[27] Dani Yogatama, et al. End-to-End Training of Multi-Document Reader and Retriever for Open-Domain Question Answering, 2021, NeurIPS.
[28] Sandro Pezzelle, et al. The LAMBADA dataset: Word prediction requiring a broad discourse context, 2016, ACL.
[29] Alex Graves. Generating Sequences With Recurrent Neural Networks, 2013, arXiv.
[30] Jason Weston, et al. Internet-Augmented Dialogue Generation, 2021, ACL.
[31] Olatunji Ruwase, et al. ZeRO: Memory Optimizations Toward Training Trillion Parameter Models, 2020, SC20: International Conference for High Performance Computing, Networking, Storage and Analysis.
[32] Yonatan Belinkov, et al. Interpretability and Analysis in Neural NLP, 2020, ACL.
[33] Richard Socher, et al. Pointer Sentinel Mixture Models, 2016, ICLR.
[34] Rico Sennrich, et al. Root Mean Square Layer Normalization, 2019, NeurIPS.
[35] Mark Chen, et al. Language Models are Few-Shot Learners, 2020, NeurIPS.
[36] Ming-Wei Chang, et al. Retrieval Augmented Language Model Pre-Training, 2020, ICML.
[37] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[38] W. Bruce Croft, et al. LDA-based document models for ad-hoc retrieval, 2006, SIGIR.
[39] Alec Radford, et al. Scaling Laws for Neural Language Models, 2020, arXiv.
[40] W. Bruce Croft, et al. Guided Transformer: Leveraging Multiple External Sources for Representation Learning in Conversational Search, 2020, SIGIR.
[41] Phil Blunsom, et al. Pitfalls of Static Language Modelling, 2021, arXiv.
[42] Thorsten Brants, et al. Large Language Models in Machine Translation, 2007, EMNLP.
[43] Colin Raffel, et al. Extracting Training Data from Large Language Models, 2020, USENIX Security Symposium.
[44] Po-Sen Huang, et al. Ethical and social risks of harm from Language Models, 2021, arXiv.
[45] Hugo Zaragoza, et al. The Probabilistic Relevance Framework: BM25 and Beyond, 2009, Found. Trends Inf. Retr.
[46] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[47] Taku Kudo, et al. SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing, 2018, EMNLP.
[48] Ming-Wei Chang, et al. Latent Retrieval for Weakly Supervised Open Domain Question Answering, 2019, ACL.
[49] Alberto Montresor, et al. WikiLinkGraphs: A complete, longitudinal and multi-language dataset of the Wikipedia link networks, 2019, ICWSM.
[50] Danqi Chen, et al. Dense Passage Retrieval for Open-Domain Question Answering, 2020, EMNLP.
[51] Nicolas Usunier, et al. Improving Neural Language Models with a Continuous Cache, 2016, ICLR.
[52] Yong Wang, et al. Search Engine Guided Neural Machine Translation, 2018, AAAI.
[53] Nicola De Cao, et al. A Memory Efficient Baseline for Open Domain Question Answering, 2020, arXiv.
[54] Dani Yogatama, et al. Adaptive Semiparametric Language Models, 2021, TACL.