The Cost of Training NLP Models: A Concise Overview

We review the cost of training large-scale language models and the main drivers of that cost. The intended audience includes engineers and scientists budgeting their model-training experiments, as well as non-practitioners trying to make sense of the economics of modern-day Natural Language Processing (NLP).
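To make this kind of budgeting concrete, here is a minimal back-of-envelope sketch of the cost of a single pre-training run. It assumes the common ~6 x (parameters) x (tokens) FLOPs rule of thumb; the specific model size, token count, accelerator throughput, and hourly price below are illustrative assumptions, not figures from this overview, and real budgets are typically multiples of a single run once hyperparameter searches and repeated experiments are factored in.

```python
# Illustrative back-of-envelope estimate of the cost of one pre-training run.
# All constants are assumptions chosen for the example, not reported figures.

def training_cost_usd(
    n_params: float,       # number of model parameters
    n_tokens: float,       # number of training tokens
    flops_per_sec: float,  # sustained throughput of one accelerator (FLOP/s)
    price_per_hour: float, # rental price of one accelerator (USD/hour)
) -> float:
    """Estimate single-run cost using the ~6 * N * D training-FLOPs rule of thumb."""
    total_flops = 6.0 * n_params * n_tokens
    accelerator_hours = total_flops / flops_per_sec / 3600.0
    return accelerator_hours * price_per_hour


if __name__ == "__main__":
    # Hypothetical example: a 340M-parameter model trained on 130B tokens,
    # on accelerators sustaining 100 TFLOP/s at $3/hour.
    cost = training_cost_usd(
        n_params=340e6,
        n_tokens=130e9,
        flops_per_sec=100e12,
        price_per_hour=3.0,
    )
    print(f"Estimated single-run cost: ${cost:,.0f}")
```

Scaling the assumed parameter and token counts up by one or two orders of magnitude, or multiplying by the number of configurations explored, quickly moves such estimates from thousands to millions of dollars, which is the range of costs this overview is concerned with.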
