[1] Steve J. Young, et al. Partially observable Markov decision processes for spoken dialog systems, 2007, Comput. Speech Lang.
[2] Kam-Fai Wong, et al. Integrating planning for task-completion dialogue policy learning, 2018, ACL.
[3] Ben Poole, et al. Categorical Reparameterization with Gumbel-Softmax, 2016, ICLR.
[4] Milica Gasic, et al. Gaussian Processes for POMDP-Based Dialogue Manager Optimization, 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[5] Kam-Fai Wong, et al. Composite Task-Completion Dialogue Policy Learning via Hierarchical Deep Reinforcement Learning, 2017, EMNLP.
[6] Yoshua Bengio, et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation, 2014, EMNLP.
[7] Geoffrey Zweig, et al. Hybrid Code Networks: practical and efficient end-to-end dialog control with supervised and reinforcement learning, 2017, ACL.
[8] Stefano Ermon, et al. Generative Adversarial Imitation Learning, 2016, NIPS.
[9] Jianfeng Gao, et al. Towards End-to-End Reinforcement Learning of Dialogue Agents for Information Access, 2016, ACL.
[10] Bing Liu, et al. Adversarial Learning of Task-Oriented Neural Dialog Models, 2018, SIGDIAL Conference.
[11] Minlie Huang, et al. Guided Dialog Policy Learning: Reward Estimation for Multi-Domain Task-Oriented Dialog, 2019, EMNLP.
[12] Jianfeng Gao, et al. Guided Dialog Policy Learning without Adversarial Learning in the Loop, 2020, EMNLP.
[13] Antoine Raux, et al. The Dialog State Tracking Challenge Series, 2014, AI Mag.
[14] Joseph Weizenbaum, et al. ELIZA—a computer program for the study of natural language communication between man and machine, 1966, CACM.
[15] Sungjin Lee, et al. ConvLab: Multi-Domain End-to-End Dialog System Platform, 2019, ACL.
[16] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[17] E. Gumbel. Statistical Theory of Extreme Values and Some Practical Applications: A Series of Lectures, 1954.
[18] David Vandyke, et al. On-line Active Reward Learning for Policy Optimisation in Spoken Dialogue Systems, 2016, ACL.
[19] Jianfeng Gao, et al. BBQ-Networks: Efficient Exploration in Deep Reinforcement Learning for Task-Oriented Dialogue Systems, 2016, AAAI.
[20] Kam-Fai Wong, et al. Adversarial Advantage Actor-Critic Model for Task-Completion Dialogue Policy Learning, 2018, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[21] Jason Weston, et al. Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks, 2015, ICLR.
[22] Stefan Ultes, et al. Sample-efficient Actor-Critic Reinforcement Learning with Supervised Data for Dialogue Management, 2017, SIGDIAL Conference.
[23] Léon Bottou, et al. Wasserstein GAN, 2017, ArXiv.
[24] David Vandyke, et al. Policy committee for adaptation in multi-domain spoken dialogue systems, 2015, IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[25] Jianfeng Gao, et al. Discriminative Deep Dyna-Q: Robust Planning for Dialogue Policy Learning, 2018, EMNLP.
[26] Jianfeng Gao, et al. End-to-End Task-Completion Neural Dialogue Systems, 2017, IJCNLP.
[27] Jiliang Tang, et al. A Survey on Dialogue Systems: Recent Advances and New Frontiers, 2017, SIGKDD Explorations.
[28] Nando de Freitas, et al. Sample Efficient Actor-Critic with Experience Replay, 2016, ICLR.
[29] Marilyn A. Walker, et al. PARADISE: A Framework for Evaluating Spoken Dialogue Agents, 1997, ACL.