暂无分享,去创建一个
Li Zhou | Charles Elkan | Oleg Rokhlenko | Kevin Small | C. Elkan | Kevin Small | O. Rokhlenko | Li Zhou
[1] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[2] Tsung-Hsien Wen,et al. Latent Intention Dialogue Models , 2017, ICML.
[3] Kam-Fai Wong,et al. Composite Task-Completion Dialogue Policy Learning via Hierarchical Deep Reinforcement Learning , 2017, EMNLP.
[4] David Vandyke,et al. On-line Active Reward Learning for Policy Optimisation in Spoken Dialogue Systems , 2016, ACL.
[5] Milica Gasic,et al. POMDP-Based Statistical Spoken Dialog Systems: A Review , 2013, Proceedings of the IEEE.
[6] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[7] Geoffrey J. Gordon,et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning , 2010, AISTATS.
[8] Jianfeng Gao,et al. A Neural Network Approach to Context-Sensitive Generation of Conversational Responses , 2015, NAACL.
[9] Andrew Y. Ng,et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping , 1999, ICML.
[10] Christopher D. Manning,et al. A Copy-Augmented Sequence-to-Sequence Architecture Gives Good Performance on Task-Oriented Dialogue , 2017, EACL.
[11] Joelle Pineau,et al. Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models , 2015, AAAI.
[12] Salim Roukos,et al. Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.
[13] Yoshua Bengio,et al. Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.
[14] David Vandyke,et al. Continuously Learning Neural Dialogue Management , 2016, ArXiv.
[15] Jianfeng Gao,et al. Deep Reinforcement Learning for Dialogue Generation , 2016, EMNLP.
[16] Martha White,et al. Linear Off-Policy Actor-Critic , 2012, ICML.
[17] Marc'Aurelio Ranzato,et al. Sequence Level Training with Recurrent Neural Networks , 2015, ICLR.
[18] Kirthevasan Kandasamy,et al. Batch Policy Gradient Methods for Improving Neural Conversation Models , 2017, ICLR.
[19] Hervé Frezza-Buet,et al. Sample-efficient batch reinforcement learning for dialogue management optimization , 2011, TSLP.
[20] Geoffrey Zweig,et al. End-to-end LSTM-based dialog control optimized with supervised and reinforcement learning , 2016, ArXiv.
[21] E. Ionides. Truncated Importance Sampling , 2008 .
[22] David Vandyke,et al. A Network-based End-to-End Trainable Task-oriented Dialogue System , 2016, EACL.
[23] Marc G. Bellemare,et al. Safe and Efficient Off-Policy Reinforcement Learning , 2016, NIPS.
[24] Joelle Pineau,et al. An Actor-Critic Algorithm for Sequence Prediction , 2016, ICLR.
[25] Ronald J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.
[26] He He,et al. Temporal supervised learning for inferring a dialog policy from example conversations , 2014, 2014 IEEE Spoken Language Technology Workshop (SLT).
[27] Stefan Ultes,et al. Sample-efficient Actor-Critic Reinforcement Learning with Supervised Data for Dialogue Management , 2017, SIGDIAL Conference.
[28] Jason Weston,et al. Learning End-to-End Goal-Oriented Dialog , 2016, ICLR.
[29] Geoffrey Zweig,et al. Hybrid Code Networks: practical and efficient end-to-end dialog control with supervised and reinforcement learning , 2017, ACL.
[30] Jianfeng Gao,et al. Towards End-to-End Reinforcement Learning of Dialogue Agents for Information Access , 2016, ACL.