Trading Value and Information in MDPs
[1] Thomas M. Cover,et al. Elements of Information Theory , 2005 .
[2] Aaron D. Wyner,et al. Coding Theorems for a Discrete Source With a Fidelity Criterion (Institute of Radio Engineers, International Convention Record, vol. 7, 1959) , 1993 .
[3] Dimitri P. Bertsekas,et al. Dynamic Programming and Optimal Control, Two Volume Set , 1995 .
[4] Andrew G. Barto,et al. Reinforcement learning , 1998 .
[5] Naftali Tishby,et al. The information bottleneck method , 2000, ArXiv.
[6] J. Fuster. The Prefrontal Cortex—An Update: Time Is of the Essence , 2001, Neuron.
[7] David A. McAllester. Simplified PAC-Bayesian Margin Bounds , 2003, COLT.
[8] Emanuel Todorov,et al. Efficient computation of optimal actions , 2009, Proceedings of the National Academy of Sciences.
[9] Karl J. Friston. The free-energy principle: a rough guide to the brain? , 2009, Trends in Cognitive Sciences.
[10] Daniel Polani,et al. Information Theory of Decisions and Actions , 2011 .
[11] Stefan Schaal,et al. Path integral control and bounded rationality , 2011, 2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL).
[12] Vicenç Gómez,et al. Optimal control as a graphical model inference problem , 2009, Machine Learning.