[1] David Silver, et al. Memory-based control with recurrent neural networks, 2015, ArXiv.
[2] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[3] Sebastian Nowozin, et al. Improved Information Gain Estimates for Decision Tree Induction, 2012, ICML.
[4] Daniel Polani, et al. Information Theory of Decisions and Actions, 2011.
[5] Dimitris Kugiumtzis, et al. Markov chain order estimation with conditional mutual information, 2013.
[6] David Barber, et al. Variational Information Maximization for Neural Coding, 2004, ICONIP.
[7] John Langford, et al. PAC Reinforcement Learning with Rich Observations, 2016, NIPS.
[8] William Bialek, et al. Entropy and Inference, Revisited, 2001, NIPS.
[9] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[10] P. Grassberger. Entropy Estimates from Insufficient Samplings, 2003, physics/0307138.
[11] Michael I. Jordan, et al. Learning Without State-Estimation in Partially Observable Markovian Decision Processes, 1994, ICML.
[12] James P. Crutchfield, et al. Computational Mechanics: Pattern and Prediction, Structure and Simplicity, 1999, ArXiv.
[13] Jianfeng Gao, et al. Recurrent Reinforcement Learning: A Hybrid Approach, 2015, ArXiv.
[14] Korbinian Strimmer, et al. Entropy Inference and the James-Stein Estimator, with Application to Nonlinear Gene Association Networks, 2008, J. Mach. Learn. Res.
[15] Krishnendu Chatterjee, et al. Qualitative Analysis of Partially-Observable Markov Decision Processes, 2009, MFCS.