Natural Actor and Belief Critic: Reinforcement algorithm for learning parameters of dialogue systems modelled as POMDPs
[1] Nasser M. Nasrabadi,et al. Pattern Recognition and Machine Learning , 2006, Technometrics.
[2] Shie Mannor,et al. Reinforcement learning with Gaussian processes , 2005, ICML.
[3] Jürgen Schmidhuber,et al. Solving Deep Memory POMDPs with Recurrent Policy Gradients , 2007, ICANN.
[4] John N. Tsitsiklis,et al. Actor-Critic Algorithms , 1999, NIPS.
[5] Joelle Pineau,et al. Spoken Dialogue Management Using Probabilistic Reasoning , 2000, ACL.
[6] Stefan Schaal,et al. Reinforcement learning of motor skills with policy gradients , 2008 .
[7] Jürgen Schmidhuber,et al. Recurrent policy gradients , 2010, Log. J. IGPL.
[8] Lawrence R. Rabiner,et al. A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.
[9] Jason Williams,et al. Using Automatically Transcribed Dialogs to Learn User Models in a Spoken Dialog System , 2008, ACL.
[10] Douglas Aberdeen,et al. Policy-Gradient Algorithms for Partially Observable Markov Decision Processes , 2003 .
[11] Maxine Eskénazi,et al. Let's go public! taking a spoken dialog system to the real world , 2005, INTERSPEECH.
[12] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[13] Blaise Roger Marie Thomson,et al. Statistical methods for spoken dialogue management , 2013 .
[14] J.D. Williams,et al. Scaling up POMDPs for Dialog Management: The "Summary POMDP" Method , 2005, IEEE Workshop on Automatic Speech Recognition and Understanding, 2005.
[15] Steve J. Young,et al. Bayesian update of dialogue state: A POMDP framework for spoken dialogue systems , 2010, Comput. Speech Lang..
[16] Milica Gasic,et al. Natural belief-critic: a reinforcement algorithm for parameter estimation in statistical spoken dialogue systems , 2010, INTERSPEECH.
[17] Stefan Schaal,et al. Natural Actor-Critic , 2003, Neurocomputing.
[18] Jason Williams. Demonstration of a POMDP Voice Dialer , 2008, ACL.
[19] Matthieu Geist,et al. Kalman Temporal Differences: The deterministic case , 2009, 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning.
[21] Jason D. Williams. Integrating expert knowledge into POMDP optimization for spoken dialog systems , 2008 .
[22] A. E. Hoerl,et al. Ridge regression: biased estimation for nonorthogonal problems , 2000 .
[23] J. Schatzmann,et al. Effects of the user model on simulation-based learning of dialogue strategies , 2005, IEEE Workshop on Automatic Speech Recognition and Understanding, 2005.
[24] Kallirroi Georgila,et al. Automatic annotation of COMMUNICATOR dialogue data for learning dialogue strategies and user simulations , 2005 .
[25] Nikolaus Hansen,et al. Completely Derandomized Self-Adaptation in Evolution Strategies , 2001, Evolutionary Computation.
[26] Shun-ichi Amari,et al. Natural Gradient Works Efficiently in Learning , 1998, Neural Computation.
[27] Milica Gasic,et al. The Hidden Information State model: A practical framework for POMDP-based spoken dialogue management , 2010, Comput. Speech Lang..
[28] Leslie Pack Kaelbling,et al. Planning and Acting in Partially Observable Stochastic Domains , 1998, Artif. Intell..
[29] Stefan Schaal,et al. Reinforcement Learning for Humanoid Robotics , 2003 .
[30] Ronald J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 1992, Machine Learning.
[31] Hyeong Seop Sim,et al. Effects of user modeling on POMDP-based dialogue systems , 2008, INTERSPEECH.
[32] James C. Spall,et al. Introduction to stochastic search and optimization - estimation, simulation, and control , 2003, Wiley-Interscience series in discrete mathematics and optimization.
[33] Baining Guo,et al. Planning and Acting under Uncertainty: A New Model for Spoken Dialogue System , 2001, UAI.
[34] Nicholas Roy,et al. Efficient model learning for dialog management , 2007, 2007 2nd ACM/IEEE International Conference on Human-Robot Interaction (HRI).