[1] Geoffrey E. Hinton et al. Distilling the Knowledge in a Neural Network, 2015, ArXiv.
[2] Ming-Wei Chang et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[3] Toshiaki Nakazawa et al. ASPEC: Asian Scientific Paper Excerpt Corpus, 2016, LREC.
[4] Graham Neubig et al. Understanding Knowledge Distillation in Non-autoregressive Machine Translation, 2020, ICLR.
[5] Myle Ott et al. fairseq: A Fast, Extensible Toolkit for Sequence Modeling, 2019, NAACL.
[6] Changhan Wang et al. Levenshtein Transformer, 2019, NeurIPS.
[7] Jiajun Zhang et al. Addressing the Under-Translation Problem from the Entropy Perspective, 2019, AAAI.
[8] Taku Kudo et al. SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing, 2018, EMNLP.
[9] Matt Post et al. A Call for Clarity in Reporting BLEU Scores, 2018, WMT.
[10] Katsuhito Sudoh et al. Incorporating Noisy Length Constraints into Transformer with Length-aware Positional Encodings, 2020, COLING.
[11] Taku Kudo et al. MeCab: Yet Another Part-of-Speech and Morphological Analyzer, 2005.
[12] Satoshi Nakamura et al. Length-constrained Neural Machine Translation using Length Prediction and Perturbation into Length-aware Positional Encoding, 2021, Journal of Natural Language Processing.
[13] Philipp Koehn et al. Findings of the 2014 Workshop on Statistical Machine Translation, 2014, WMT.
[14] Naoaki Okazaki et al. Positional Encoding to Control Output Sequence Length, 2019, NAACL.
[15] Marcello Federico et al. Controlling the Output Length of Neural Machine Translation, 2019, IWSLT.
[16] Salim Roukos et al. Bleu: a Method for Automatic Evaluation of Machine Translation, 2002, ACL.
[17] Omer Levy et al. Mask-Predict: Parallel Decoding of Conditional Masked Language Models, 2019, EMNLP.