Learning Goal-oriented Dialogue Policy with Opposite Agent Awareness
Tat-Seng Chua | Minlie Huang | Lizi Liao | Yan Huang | Zitao Liu | Xiaoyan Zhu | Zheng Zhang | Yi-Feng Huang
[1] David Vandyke,et al. Continuously Learning Neural Dialogue Management , 2016, ArXiv.
[2] Mary P. Harper,et al. Learning from 26 Languages: Program Management and Science in the Babel Program , 2014, COLING.
[3] Milica Gasic,et al. POMDP-Based Statistical Spoken Dialog Systems: A Review , 2013, Proceedings of the IEEE.
[4] Steve J. Young,et al. Partially observable Markov decision processes for spoken dialog systems , 2007, Comput. Speech Lang.
[5] Kam-Fai Wong,et al. Integrating planning for task-completion dialogue policy learning , 2018, ACL.
[6] Yiming Yang,et al. Switch-based Active Deep Dyna-Q: Efficient Adaptive Planning for Task-Completion Dialogue Policy Learning , 2018, AAAI.
[7] Gökhan Tür,et al. User Modeling for Task Oriented Dialogues , 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[8] Enhong Chen,et al. Budgeted Policy Learning for Task-Oriented Dialogue Systems , 2019, ACL.
[9] Mike Lewis,et al. Hierarchical Text Generation and Planning for Strategic Dialogue , 2017, ICML.
[10] Jianfeng Gao,et al. End-to-End Task-Completion Neural Dialogue Systems , 2017, IJCNLP.
[11] Seunghak Yu,et al. Scaling up deep reinforcement learning for multi-domain dialogue systems , 2017, 2017 International Joint Conference on Neural Networks (IJCNN).
[12] Razvan Pascanu,et al. Imagination-Augmented Agents for Deep Reinforcement Learning , 2017, NIPS.
[13] R. Gordon. Folk Psychology as Simulation , 1986.
[14] Hui Ye,et al. Agenda-Based User Simulation for Bootstrapping a POMDP Dialogue System , 2007, NAACL.
[15] M. Tomasello,et al. Does the chimpanzee have a theory of mind? 30 years later , 2008, Trends in Cognitive Sciences.
[16] Maxine Eskénazi,et al. Towards End-to-End Learning for Dialog State Tracking and Management using Deep Reinforcement Learning , 2016, SIGDIAL Conference.
[17] Hao Tian,et al. Policy Learning for Domain Selection in an Extensible Multi-domain Spoken Dialogue System , 2014, EMNLP.
[18] Dilek Z. Hakkani-Tür,et al. Dialogue Learning with Human Teaching and Feedback in End-to-End Trainable Task-Oriented Dialogue Systems , 2018, NAACL.
[19] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.
[20] Ronald J. Williams. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 1992, Machine Learning.
[21] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[22] Jianfeng Gao,et al. A User Simulator for Task-Completion Dialogues , 2016, ArXiv.
[23] Steve J. Young,et al. A survey of statistical user simulation techniques for reinforcement-learning of dialogue management strategies , 2006, The Knowledge Engineering Review.
[24] Yang Feng,et al. Bridging the Gap between Training and Inference for Neural Machine Translation , 2019, ACL.
[25] Kam-Fai Wong,et al. Composite Task-Completion Dialogue Policy Learning via Hierarchical Deep Reinforcement Learning , 2017, EMNLP.
[26] Oliver Lemon,et al. Reinforcement Learning for Adaptive Dialogue Systems - A Data-driven Methodology for Dialogue Management and Natural Language Generation , 2011, Theory and Applications of Natural Language Processing.
[27] Roberto Pieraccini,et al. Learning dialogue strategies within the Markov decision process framework , 1997, 1997 IEEE Workshop on Automatic Speech Recognition and Understanding Proceedings.
[28] Steve J. Young,et al. A survey of statistical user simulation techniques for reinforcement-learning of dialogue management strategies , 2006.
[29] Yann Dauphin,et al. Deal or No Deal? End-to-End Learning of Negotiation Dialogues , 2017, EMNLP.
[30] Iñigo Casanueva,et al. Neural User Simulation for Corpus-based Policy Optimisation of Spoken Dialogue Systems , 2018, SIGDIAL Conference.
[31] Geoffrey Zweig,et al. Hybrid Code Networks: practical and efficient end-to-end dialog control with supervised and reinforcement learning , 2017, ACL.
[32] Minlie Huang,et al. Guided Dialog Policy Learning: Reward Estimation for Multi-Domain Task-Oriented Dialog , 2019, EMNLP.
[33] Jing He,et al. A Sequence-to-Sequence Model for User Simulation in Spoken Dialogue Systems , 2016, INTERSPEECH.
[34] Jianfeng Gao,et al. Discriminative Deep Dyna-Q: Robust Planning for Dialogue Policy Learning , 2018, EMNLP.
[35] Jianfeng Gao,et al. Towards End-to-End Reinforcement Learning of Dialogue Agents for Information Access , 2016, ACL.
[36] Pararth Shah,et al. Multi-Action Dialog Policy Learning with Interactive Human Teaching , 2020, SIGDIAL.
[37] Lihong Li,et al. Neural Approaches to Conversational AI , 2019, Found. Trends Inf. Retr..
[38] Jing He,et al. Policy Networks with Two-Stage Training for Dialogue Systems , 2016, SIGDIAL Conference.
[39] A. Goldman,et al. Mirror neurons and the simulation theory of mind-reading , 1998, Trends in Cognitive Sciences.
[40] Stefan Ultes,et al. Feudal Reinforcement Learning for Dialogue Management in Large Domains , 2018, NAACL.
[41] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[42] David Vandyke,et al. Policy committee for adaptation in multi-domain spoken dialogue systems , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[43] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[44] Stefan Ultes,et al. MultiWOZ - A Large-Scale Multi-Domain Wizard-of-Oz Dataset for Task-Oriented Dialogue Modelling , 2018, EMNLP.
[45] Luca Antiga,et al. Automatic differentiation in PyTorch , 2017.
[46] Minlie Huang,et al. Multi-Agent Task-Oriented Dialog Policy Learning with Role-Aware Reward Decomposition , 2020, ACL.
[47] Richard S. Sutton,et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming , 1990, ML.
[48] Bing Liu,et al. Iterative policy learning in end-to-end trainable task-oriented neural dialog models , 2017, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[49] Jianfeng Gao,et al. BBQ-Networks: Efficient Exploration in Deep Reinforcement Learning for Task-Oriented Dialogue Systems , 2016, AAAI.