Rahul Goel | Angeliki Metallinou | Ashwin Ram | Rohit Prasad | Shaohua Yang | Ming Cheng | Fenfei Guo | Behnam Hedayatnia | Chandra Khatri | Anirudh Raju | Anu Venkatesh | Raefer Gabriel | Ashish Nagar
[1] Alan Ritter, et al. Adversarial Learning for Neural Dialogue Generation, 2017, EMNLP.
[2] Alon Lavie, et al. METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments, 2005, IEEvaluation@ACL.
[3] Joelle Pineau, et al. Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models, 2015, AAAI.
[4] Salim Roukos, et al. Bleu: a Method for Automatic Evaluation of Machine Translation, 2002, ACL.
[5] Joseph Weizenbaum, et al. and Machine, 1977.
[6] Michael White, et al. Further Meta-Evaluation of Broad-Coverage Surface Realization, 2010, EMNLP.
[7] Ondrej Bojar, et al. Results of the WMT17 Metrics Shared Task, 2017, WMT.
[8] Marilyn A. Walker, et al. PARADISE: A Framework for Evaluating Spoken Dialogue Agents, 1997, ACL.
[9] Aoife Cahill. Correlating Human and Automatic Evaluation of a German Surface Realiser, 2009, ACL/IJCNLP.
[10] Hector J. Levesque, et al. Common Sense, the Turing Test, and the Quest for Real AI, 2017.
[11] Allison Woodruff, et al. Detecting user engagement in everyday conversations, 2004, INTERSPEECH.
[12] Timothy Baldwin, et al. Accurate Evaluation of Segment-level Machine Translation Metrics, 2015, NAACL.
[13] J. Elith, et al. A working guide to boosted regression trees, 2008, The Journal of animal ecology.
[14] Eric Steven Atwell, et al. Different measurement metrics to evaluate a chatbot system, 2007, HLT-NAACL 2007.
[15] Jennifer Chu-Carroll, et al. MIMIC: An Adaptive Mixed Initiative Spoken Dialogue System for Information Queries, 2000, ANLP.
[16] Rebecca Hwa, et al. Regression for Sentence-Level MT Evaluation with Pseudo References, 2007, ACL.
[17] Joelle Pineau, et al. A Survey of Available Corpora for Building Data-Driven Dialogue Systems, 2015, Dialogue Discourse.
[18] James F. Allen, et al. TRAINS-95: Towards a Mixed-Initiative Planning Assistant, 1996, AIPS.
[19] Joelle Pineau, et al. Towards an Automatic Turing Test: Learning to Evaluate Dialogue Responses, 2017, ACL.
[20] B. M. Bennett. On an Approximate Test for Homogeneity of Coefficients of Variation, 1976.
[21] Ryuichiro Higashinaka, et al. Evaluating coherence in open domain conversational systems, 2014, INTERSPEECH.
[22] Gisela Redeker, et al. On differences between spoken and written language, 1984.
[23] Joelle Pineau, et al. How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation, 2016, EMNLP.
[24] Timothy Baldwin, et al. Can machine translation systems be evaluated by the crowd alone, 2015, Natural Language Engineering.
[25] Samy Bengio, et al. Generating Sentences from a Continuous Space, 2015, CoNLL.
[26] K. Á. T., et al. Towards a tool for the Subjective Assessment of Speech System Interfaces (SASSI), 2000, Natural Language Engineering.
[27] Josef van Genabith, et al. ReVal: A Simple and Effective Machine Translation Evaluation Metric Based on Recurrent Neural Networks, 2015, EMNLP.
[28] Philipp Koehn, et al. Findings of the 2011 Workshop on Statistical Machine Translation, 2011, WMT@EMNLP.
[29] A. M. Turing, et al. Computing Machinery and Intelligence, 1950, The Philosophy of Artificial Intelligence.