BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
Alexander M. Rush | Dragomir R. Radev | Ona de Gibert | Stephen H. Bach | David Ifeoluwa Adelani | Alham Fikri Aji | Hyung Won Chung | Genta Indra Winata | Mike Tian-Jian Jiang | Daniel H Garrette | Tiago Timponi Torrent | M Saiful Bari | Zheng Xin Yong | Chris C. Emezue | Teven Le Scao | Jason Alan Fries | Patrick von Platen | Leandro von Werra | Nihal V. Nayak | Oskar van der Wal | Shamsuddeen Hassan Muhammad | Stella Rose Biderman | Javier de la Rosa | Carlos Muñoz Ferrandis | Maged S. Al-shaibani | Abhinav Ramesh Kashyap | Julio Bonis Sanz | Eduardo G. Ponferrada | Sabrina J. Mielke | Pawan Sasanka Ammanamanchi | Pedro Ortiz Suarez | Albert Villanova del Moral | Samyam Rajbhandari | Jeff Rasley | Olatunji Ruwase | M. Shoeybi | J. Casper | Iz Beltagy | Kyle Lo | D. Narayanan | Colin Raffel | Jesse Dodge | Yacine Jernite | Ofir Press | Angela Fan | Margaret Mitchell | Danish Contractor | Minjia Zhang | Aitor Soroa Etxabe | Max Ryabinin | Irene Solaiman | Adam Roberts | Sebastian Gehrmann | Urmish Thakker | Benoît Sagot | Gully A. Burns | Ehud Reiter | Thomas Wolf | Germán Kruszewski | Veronika Laippala | Sampo Pyysalo | Marine Carpuat | Benjamin Heinzerling | D. Tunuguntla | Antonio Miranda-Escalada | A. Callahan | Dian Yu | Hendrik Strobelt | M. Samwald | Pascale Fung | Jungo Kasai | Itziar Gonzalez-Dios | Michael McKenna | Sheng Shen | Jonathan Chang | Nazneen Rajani | Conglong Li | Isaac Johnson | Thibault Févry | Nora Kassner | Anna Rogers | Chenglei Si | Elizabeth Salesky | Verena Rieser | Jekaterina Novikova | François Yvon | Rachel Bawden | Tristan Thrush | Julien Launay | Christopher Klamm | Aaron Gokaslan | Simon Ott | Tatiana Shavrina | B. Ajibade | Matteo Manica | Najoung Kim | Taewoon Kim | Douwe Kiela | Niklas Muennighoff | Nafis Abrar | J. Forde | Zhiqing Sun | Vikas Raunak | Anne-Laure Ligozat | Jian Zhu | S. Longpre | Newton Cheng | Azadeh HajiHosseini | Antoine Chaffin | Thomas Scialom | Sourav Roy | Shaden Smith | Vassilina Nikoulina | S. Viguier | Gunjan Chhablani | N. Muellner | A. Feizpour | Myriam Peyrounette | V. Danchev | Maximin Coavoux | Mayank Singh | Debajyoti Datta | J. Golde | R. López | Luisa Shinzato | Alice Rueda | J. Bhattacharjee | Edward Tan | Olivier Nguyen | Matthias Gallé | Zifan Ye | N. Dahlberg | Arjun Subramonian | R. Lacroix | Clémentine Fourrier | I. Nejadgholi | Lu Liu | Yanis Labrak | Minna Liu | Albert Webson | D. Lansky | John Giorgi | Canwen Xu | Samuel Albanie | Wojciech Kusa | Harshit Pandey | Daniel Hesslow | S. Alizadeh | Victor Sanh | Zaid Alyafeai | Arnaud Stiegler | Arun Raja | Manan Dey | Shanya Sharma | Eliza Szczechla | Han Wang | Thomas Wang | Trishala Neeraj | Jos Rozen | Abheesht Sharma | Andrea Santilli | Ryan Teehan | Leo Gao | T. Bers | Rui Zhang | Leon Weber | R. Ribeiro | Jason Phang | Jordan Clive | Peter Henderson | Nishant Subramani | A. Luccioni | R. Kromann | Pierre Colombo | Srishti Kumar | L. Tanguy | Samuel Cahyawijaya | Jenny Chim | Ken Kawamura | Mustafa Ghaleb | V. Mikhailov | Myungsun Kang | Idris Abdulmumin | Hady ElSahar | Colin Leong | Hieu Tran | Fatim T Mirza | Indrani Bhattacharya | Stefan Schweter | Jörg Frohberg | Tim Dettmers | Ahmed Baruwa | Joshua Seltzer | Elizabeth-Jane Pavlick | Huu Nguyen | Maraim Masoud | Samson Tan | Gérard Dupont | Zeerak Talat | Somaieh Nikpoor | Rishi Bommasani | Christopher Akiki | Karthi Sivaraman | Yada Pruksachatkun | A. Tammour | Yonatan Belinkov | F. Toni | Enrique Manjavacas | Daniel Alexander van Strien | Natasha Seelam | Gabriel Altay | Ruisi Su | Samuele Garda | Bo Wang | Fabio Barth | Mario Sänger | Daniel León Periñán | Théo Gigant | J. Posada | Marc Pàmies | Marianna Nezhurina | Robert Martin | Michael Cullan | Shamik Bose | Shlok S Deshmukh | Sid Kiblawi | Benjamin Beilharz | Hugo Laurençon | Ethan Kim | Timo Schick | Paulo Villegas | Jaesung Tae | Quentin Lhoest | Lucile Saulnier | Davis David | Salomey Osei | Nurulaqilla Khamis | Chenxi Zhou | Habib Rezanejad | J. Tow | Charles Lovering | Jan-Christoph Kalo | S. Zink | Amit Alfassy | Michael Weinberg | Long Phan | Angelina McMillan-Major | Mayank Mishra | T. A. Laud | Wilson Y. Lee | M. Muñoz | Tomasz Limisiewicz | Eli Bogdanov | Sanchit Gandhi | Ying Xu | Ekaterina Taktasheva | Oleg Serikov | V. Protasov | E. Voloshina | Adi Simhi | Hailey Schoelkopf | Omer Antverg | Lintang Sutawika | Y. Venkatraman | M. Freidank | Y. Uri | B. Saxena | Silas L. Wang | S. Pais | Suzana Ilić | Roman Castagné | Stas Bekman | Ariel Kreisberg Nitzav | Chenghao Mou | Efrat Levkovizh | E. Natan | Giada Pistilli | Hamza Benyamina | Ian Yu | Josephine L. Tobing | Khalid Almubarak | Kimbo Chen | María Grandury | Mario Šaško | Max Huang | Minh Chien Vu | M. A. Jauhar | Omar Espejel | Priscilla Amuok | Rheza Harliman | Sebastian Nagel | Stanislav Silberberg | S. Pai | Violette Lepercq | V. Prabhu | Srulik Ben-David | Xiang Tang | Shaked Brody | Hadar Tojarieh | Hatim Bourfoune | N. Patry | Nouamane Tazi | Omar Sanseviero | Pierre Cornette | Pierre François Lavallée | S. Requena | Suraj Patil | Anastasia Cheveleva | Aurélie Névéol | Liam Hazan | Miruna Clinciu | Tian Yun | Zachary Bamberger | Zdeněk Kasner | Amanda Pestana | Ammar Khan | Amy Faranak | A. Santos | A. Hevia | Antigona Unldreaj | Arash Aghagol | Arezoo Abdollahi | Bahareh Behroozi | D. A. Nguyen | Emily Baylor | Ezinwanne Ozoani | Frankline Ononiwu | H.A. Jones | Irina Sedenko | J. Passmore | L. Dutra | Mairon Samagaio | Maraim Elbadri | Marissa Gerchick | Martha Akinlolu | Mike Qiu | M. Ghauri | Mykola Burynok | Nour Elkott | N. Fahmy | O. Samuel | Ran An | Ryan Hao | Sarmad Shubber | Thanh-Cong Le | Tobi Oyebade | T. Le | Yoyo Yang | Z. Nguyen | Alfredo Palasciano | Anima Shukla | A. Singh | C. Brito | Chirag Jain | Chuxin Xu | Daniel Molano | Florian Fuhrimann | Giyaseddin Bayrak | Helena U. Vrabec | I. Bello | Isha Dash | J. Kang | Lokesh Bulchandani | Madeleine Hahn de Bykhovetz | Maiko Takeuchi | M. A. Castillo | M. Wolf | Mina Mihaljcic | N. Broad | Patricia Haller | R. Chandrasekhar | R. Eisenberg | Rodrigo L. Canalli | Rosaline Su | Shubhanshu Mishra | Sinee Sang-aroonsiri | S. Bharati | Tomoya Kainuma | Yashasvi Bajaj | Yifan Xu | Z. Tan | Zhongli Xie | M. Bras | Younes Belkada | Loubna Ben Allal | A. Singh | Ruochen Zhang | Karen Fort | M. Mieskes | Yun-chao Xu | Rui Ribeiro | Amanpreet Singh