Metaphorical User Simulators for Evaluating Task-oriented Dialogue Systems
M. de Rijke | Z. Ren | Zhumin Chen | Pengjie Ren | Shuo Zhang | Weiwei Sun | Shuyu Guo
[1] K. Balog,et al. Analyzing and Simulating User Utterance Reformulation in Conversational Recommender Systems , 2022, SIGIR.
[2] Jason Weston,et al. Human Evaluation of Conversations is an Open Problem: comparing the sensitivity of various methods for evaluating dialogue agents , 2022, NLP4CONVAI.
[3] Yinhe Zheng,et al. GALAXY: A Generative Pre-trained Model for Task-Oriented Dialog with Semi-Supervised Learning and Explicit Policy Injection , 2021, AAAI.
[4] Elman Mansimov,et al. Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System , 2021, ACL.
[5] Yuhang Guo,et al. Mengzi: Towards Lightweight yet Ingenious Pre-trained Models for Chinese , 2021, ArXiv.
[6] Baolin Peng,et al. Soloist: Building Task Bots at Scale with Transfer Learning and Machine Teaching , 2021, Transactions of the Association for Computational Linguistics.
[7] Bill Byrne,et al. Transferable Dialogue Systems and User Simulators , 2021, ACL.
[8] Paul Thomas,et al. Sim4IR: The SIGIR 2021 Workshop on Simulation for Information Retrieval Evaluation , 2021, SIGIR.
[9] Chengxiang Zhai,et al. An Exploration of Tester-based Evaluation of User Simulators for Comparing Interactive Retrieval Systems , 2021, SIGIR.
[10] Ondrej Dusek,et al. Shades of BLEU, Flavours of Success: The Case of MultiWOZ , 2021, GEM.
[11] M. de Rijke,et al. Wizard of Search Engine: Access to Information Through Conversations with Search Engines , 2021, SIGIR.
[12] Zhou Yu,et al. Leveraging Slot Descriptions for Zero-Shot Cross-Domain Dialogue State Tracking , 2021, NAACL.
[13] M. de Rijke,et al. Simulating User Satisfaction for the Evaluation of Task-oriented Dialogue Systems , 2021, SIGIR.
[14] M. de Rijke,et al. Advances and Challenges in Conversational Recommender Systems: A Survey , 2021, AI Open.
[15] Xiaojun Quan,et al. UBAR: Towards Fully End-to-End Task-Oriented Dialog Systems with GPT-2 , 2020, AAAI.
[16] Minlie Huang,et al. CR-Walker: Tree-Structured Graph Reasoning and Dialog Acts for Conversational Recommendation , 2020, EMNLP.
[17] Minlie Huang,et al. MultiWOZ 2.3: A Multi-domain Task-Oriented Dialogue Dataset Enhanced with Annotation Corrections and Co-Reference Annotation , 2020, NLPCC.
[18] Nicola De Cao,et al. KILT: a Benchmark for Knowledge Intensive Language Tasks , 2020, NAACL.
[19] K. Balog. Conversational AI from an Information Retrieval Perspective: Remaining Challenges and a Case for User Simulation , 2021, DESIRES.
[20] M. de Rijke,et al. Keeping Dataset Biases out of the Simulation: A Debiased Simulator for Reinforcement Learning based Recommender Systems , 2020, RecSys.
[21] Andrea Madotto,et al. Language Models as Few-Shot Learner for Task-Oriented Dialogue Systems , 2020, ArXiv.
[22] M. de Rijke,et al. Conversational Recommendation: Formulation, Methods, and Evaluation , 2020, SIGIR.
[23] Yulong Gu,et al. Neural Interactive Collaborative Filtering , 2020, SIGIR.
[24] Elizabeth Clark,et al. Evaluation of Text Generation: A Survey , 2020, ArXiv.
[25] Krisztian Balog,et al. Evaluating Conversational Recommender Systems via User Simulation , 2020, KDD.
[26] R. Socher,et al. A Simple Language Model for Task-Oriented Dialogue , 2020, Neural Information Processing Systems.
[27] Zheng Zhang,et al. Recent advances and challenges in task-oriented dialog systems , 2020, Science China Technological Sciences.
[28] Jimmy J. Lin,et al. Document Ranking with a Pretrained Sequence-to-Sequence Model , 2020, FINDINGS.
[29] Xiangnan He,et al. Estimation-Action-Reflection: Towards Deep Interaction Between Conversational and Recommender Systems , 2020, WSDM.
[30] Xiaodong He,et al. The JDDC Corpus: A Large-Scale Multi-Turn Chinese Dialogue Dataset for E-commerce Customer Service , 2019, LREC.
[31] Colin Raffel,et al. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer , 2019, J. Mach. Learn. Res.
[32] Zhou Yu,et al. MOSS: End-to-End Dialog System Framework with Modular Supervision , 2019, AAAI.
[33] Anuj Kumar Goyal,et al. MultiWOZ 2.1: A Consolidated Multi-Domain Dialogue Dataset with State Corrections and State Tracking Baselines , 2019, LREC.
[34] Arantxa Otegi,et al. Survey on evaluation methods for dialogue systems , 2019, Artificial Intelligence Review.
[35] Zhou Yu,et al. How to Build User Simulators to Train RL-based Dialog Systems , 2019, EMNLP.
[36] Gökhan Tür,et al. Collaborative Multi-Agent Dialogue Model Training Via Reinforcement Learning , 2019, SIGdial.
[37] Nava Tintarev,et al. SIREN: A Simulation Framework for Understanding the Effects of Recommender Systems in Online News Environments , 2019, FAT.
[38] Danqi Chen,et al. CoQA: A Conversational Question Answering Challenge , 2018, TACL.
[39] Ilya Sutskever,et al. Language Models are Unsupervised Multitask Learners , 2019.
[40] Xu Chen,et al. Towards Conversational Search and Recommendation: System Ask, User Respond , 2018, CIKM.
[41] Zhaochun Ren,et al. Explicit State Tracking with Semi-Supervision for Neural Dialogue Generation , 2018, CIKM.
[42] Min-Yen Kan,et al. Sequicity: Simplifying Task-oriented Dialogue Systems with Single Sequence-to-Sequence Architectures , 2018, ACL.
[43] Yi Zhang,et al. Conversational Recommender System , 2018, SIGIR.
[44] Yinan Zhang,et al. Information Retrieval Evaluation as Search Simulation: A General Formal Framework for IR Evaluation , 2017, ICTIR.
[45] Tsung-Hsien Wen,et al. Neural Belief Tracker: Data-Driven Dialogue State Tracking , 2016, ACL.
[46] David Vandyke,et al. A Network-based End-to-End Trainable Task-oriented Dialogue System , 2016, EACL.
[47] David Maxwell,et al. Agents, Simulated Users and Humans: An Analysis of Performance and Behaviour , 2016, CIKM.
[48] Jianfeng Gao,et al. A Diversity-Promoting Objective Function for Neural Conversation Models , 2015, NAACL.
[49] Homa B. Hashemi,et al. Query Intent Detection using Convolutional Neural Networks , 2016.
[50] David Vandyke,et al. Multi-domain Dialog State Tracking using Recurrent Neural Networks , 2015, ACL.
[51] Milica Gasic,et al. POMDP-Based Statistical Spoken Dialog Systems: A Review , 2013, Proceedings of the IEEE.
[52] Helen F. Hastie,et al. A survey on metrics for the evaluation of user simulations , 2012, The Knowledge Engineering Review.
[53] A. Kaal. Metaphor in conversation , 2012.
[54] Milica Gasic,et al. Real User Evaluation of Spoken Dialogue Systems Using Amazon Mechanical Turk , 2011, INTERSPEECH.
[55] Ben Carterette,et al. Simulating simple user behavior for system effectiveness evaluation , 2011, CIKM '11.
[56] Maxine Eskénazi,et al. Spoken Dialog Challenge 2010: Comparison of Live and Control Test Results , 2011, SIGDIAL Conference.
[57] Anne Leitch,et al. Mental models: an interdisciplinary synthesis of theory and methods , 2011.
[58] Mattias Heldner,et al. Towards human-like spoken dialogue systems , 2008, Speech Commun.
[59] Jens Lehmann,et al. DBpedia: A Nucleus for a Web of Open Data , 2007, ISWC/ASWC.
[60] Hui Ye,et al. Agenda-Based User Simulation for Bootstrapping a POMDP Dialogue System , 2007, NAACL.
[61] Kallirroi Georgila,et al. Learning user simulations for information state update dialogue systems , 2005, INTERSPEECH.
[62] H. Cuayahuitl,et al. Human-computer dialogue simulation using hidden Markov models , 2005, IEEE Workshop on Automatic Speech Recognition and Understanding, 2005.
[63] Salim Roukos,et al. Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.
[64] Lori Lamel,et al. The LIMSI ARISE system , 2000, Speech Commun.
[65] Roberto Pieraccini,et al. User modeling for spoken dialogue system evaluation , 1997, 1997 IEEE Workshop on Automatic Speech Recognition and Understanding Proceedings.
[66] Marilyn A. Walker,et al. PARADISE: A Framework for Evaluating Spoken Dialogue Agents , 1997, ACL.