Towards Quantifiable Dialogue Coherence Evaluation
Liang Lin | Zheng Ye | Xiaodan Liang | Liucun Lu | Lishan Huang
[1] Dongyan Zhao et al. RUBER: An Unsupervised Method for Automatic Evaluation of Open-Domain Dialog Systems, 2017, AAAI.
[2] Geoffrey E. Hinton et al. Distilling the Knowledge in a Neural Network, 2015, arXiv.
[3] Xin Jiang et al. TinyBERT: Distilling BERT for Natural Language Understanding, 2019, Findings of EMNLP.
[4] Alon Lavie et al. METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments, 2005, IEEvaluation@ACL.
[5] Verena Rieser et al. Why We Need New Evaluation Metrics for NLG, 2017, EMNLP.
[6] Joelle Pineau et al. How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation, 2016, EMNLP.
[7] Kilian Q. Weinberger et al. BERTScore: Evaluating Text Generation with BERT, 2019, ICLR.
[8] Xiaoyu Shen et al. DailyDialog: A Manually Labelled Multi-turn Dialogue Dataset, 2017, IJCNLP.
[9] Iryna Gurevych et al. Dialogue Coherence Assessment Without Explicit Dialogue Act Labels, 2019, ACL.
[10] Joelle Pineau et al. Towards an Automatic Turing Test: Learning to Evaluate Dialogue Responses, 2017, ACL.
[11] Shujian Huang et al. Online Distilling from Checkpoints for Neural Machine Translation, 2019, NAACL.
[12] Ce Liu et al. Supervised Contrastive Learning, 2020, NeurIPS.
[13] Thibault Sellam et al. BLEURT: Learning Robust Metrics for Text Generation, 2020, ACL.
[14] Chin-Yew Lin et al. ROUGE: A Package for Automatic Evaluation of Summaries, 2004, ACL.
[15] Zhangyang Wang et al. In Defense of the Triplet Loss Again: Learning Robust Person Re-Identification with Fast Approximated Triplet Loss and Label Distillation, 2020, IEEE/CVF CVPR Workshops (CVPRW).
[16] Jiajun Zhang et al. Distill and Replay for Continual Language Learning, 2020, COLING.
[17] Joelle Pineau et al. The Second Conversational Intelligence Challenge (ConvAI2), 2019, The NeurIPS '18 Competition.
[18] Lynda Tamine et al. Knowledge Base Embedding by Cooperative Knowledge Distillation, 2020, COLING.
[19] Mitesh M. Khapra et al. Improving Dialog Evaluation with a Multi-reference Adversarial Dataset and Large Scale Pretraining, 2020, Transactions of the Association for Computational Linguistics.
[20] Y-Lan Boureau et al. Towards Empathetic Open-domain Conversation Models: A New Benchmark and Dataset, 2018, ACL.
[21] Evgeny A. Stepanov et al. Coherence Models for Dialogue, 2018, INTERSPEECH.
[22] Yan Wang et al. The World Is Not Binary: Learning to Rank with Grayscale Data for Dialogue Response Selection, 2020, EMNLP.
[23] Salim Roukos et al. Bleu: a Method for Automatic Evaluation of Machine Translation, 2002, ACL.
[24] Nanyun Peng et al. Better Automatic Evaluation of Open-Domain Dialogue Systems with Contextualized Embeddings, 2019, Proceedings of the Workshop on Methods for Optimizing and Evaluating Neural Language Generation.
[25] Ming-Wei Chang et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[26] Sepp Hochreiter et al. Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs), 2015, ICLR.