Jianfeng Gao | Xiujun Li | Li Deng | Lihong Li | Zachary C. Lipton | Faisal Ahmed
[1] Max Welling, et al. Auto-Encoding Variational Bayes, 2013, ICLR.
[2] Sergey Levine, et al. Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models, 2015, ArXiv.
[3] Dongho Kim, et al. Incremental on-line adaptation of POMDP-based dialogue managers to extended domains, 2014, INTERSPEECH.
[4] Alex Graves, et al. Playing Atari with Deep Reinforcement Learning, 2013, ArXiv.
[5] Shie Mannor, et al. Reinforcement learning with Gaussian processes, 2005, ICML.
[6] Geoffrey E. Hinton, et al. Keeping the neural networks simple by minimizing the description length of the weights, 1993, COLT.
[7] Sharad Vikram, et al. Generative Concatenative Nets Jointly Learn to Write and Classify Reviews, 2015, ArXiv.
[8] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[9] Roberto Pieraccini, et al. Learning dialogue strategies within the Markov decision process framework, 1997, IEEE Workshop on Automatic Speech Recognition and Understanding.
[10] Steve Young, et al. Statistical User Simulation with a Hidden Agenda, 2007, SIGDIAL.
[11] Jianfeng Gao, et al. A User Simulator for Task-Completion Dialogues, 2016, ArXiv.
[12] Marilyn A. Walker, et al. An Application of Reinforcement Learning to Dialogue Strategy Selection in a Spoken Dialogue System for Email, 2000, J. Artif. Intell. Res.
[13] David Silver, et al. Deep Reinforcement Learning with Double Q-Learning, 2015, AAAI.
[14] Razvan Pascanu, et al. Overcoming catastrophic forgetting in neural networks, 2016, Proceedings of the National Academy of Sciences.
[15] Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.
[16] Alex Graves, et al. Practical Variational Inference for Neural Networks, 2011, NIPS.
[17] Oliver Lemon, et al. Machine Learning for Spoken Dialogue Systems, 2007, INTERSPEECH.
[18] Marilyn A. Walker, et al. Reinforcement Learning for Spoken Dialogue Systems, 1999, NIPS.
[19] Long Ji Lin, et al. Self-improving reactive agents based on reinforcement learning, planning and teaching, 1992, Machine Learning.
[20] Julien Cornebise, et al. Weight Uncertainty in Neural Networks, 2015, ArXiv.
[21] Peter Dayan, et al. Q-learning, 1992, Machine Learning.
[22] Lihong Li, et al. An Empirical Evaluation of Thompson Sampling, 2011, NIPS.
[23] Malcolm J. A. Strens, et al. A Bayesian Framework for Reinforcement Learning, 2000, ICML.
[24] Geoffrey Zweig, et al. End-to-end LSTM-based dialog control optimized with supervised and reinforcement learning, 2016, ArXiv.
[25] Milica Gasic, et al. Gaussian Processes for Fast Policy Optimisation of POMDP-based Dialogue Managers, 2010, SIGDIAL.
[26] Steve J. Young, et al. Characterizing task-oriented dialog using a simulated ASR channel, 2004, INTERSPEECH.
[27] Benjamin Van Roy, et al. (More) Efficient Reinforcement Learning via Posterior Sampling, 2013, NIPS.
[28] Filip De Turck, et al. VIME: Variational Information Maximizing Exploration, 2016, NIPS.
[29] Thomas J. Walsh, et al. Knows what it knows: a framework for self-aware learning, 2008, ICML.
[30] Filip De Turck, et al. Curiosity-driven Exploration in Deep Reinforcement Learning via Bayesian Neural Networks, 2016, ArXiv.
[31] Jing He, et al. Policy Networks with Two-Stage Training for Dialogue Systems, 2016, SIGDIAL.
[32] W. R. Thompson. On the Likelihood that One Unknown Probability Exceeds Another in View of the Evidence of Two Samples, 1933.
[33] Alex M. Andrew, et al. Reinforcement Learning: An Introduction, 1998.
[34] Tom Schaul, et al. Prioritized Experience Replay, 2015, ICLR.
[35] Lihong Li, et al. A Bayesian Sampling Approach to Exploration in Reinforcement Learning, 2009, UAI.
[36] Benjamin Van Roy, et al. Deep Exploration via Bootstrapped DQN, 2016, NIPS.
[37] Heriberto Cuayáhuitl, et al. SimpleDS: A Simple Deep Reinforcement Learning Dialogue System, 2016, IWSDS.
[38] David Vandyke, et al. A Network-based End-to-End Trainable Task-oriented Dialogue System, 2016, EACL.
[39] Sharad Vikram, et al. Capturing Meaning in Product Reviews with Character-Level Generative Text Models, 2015, ArXiv.
[40] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[41] Sham M. Kakade, et al. On the sample complexity of reinforcement learning, 2003.
[42] Tom Schaul, et al. Unifying Count-Based Exploration and Intrinsic Motivation, 2016, NIPS.
[43] Steve J. Young, et al. Partially observable Markov decision processes for spoken dialog systems, 2007, Comput. Speech Lang.
[44] Peter Auer, et al. Near-optimal Regret Bounds for Reinforcement Learning, 2008, J. Mach. Learn. Res.
[45] Tom Schaul, et al. Dueling Network Architectures for Deep Reinforcement Learning, 2015, ICML.
[46] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.