Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems