State Abstraction as Compression in Apprenticeship Learning
Lawson L. S. Wong | Kavosh Asadi | Michael L. Littman | Dilip Arumugam | David Abel | Yuu Jinnai
[1] F. Attneave. Some informational aspects of visual perception, 1954, Psychological Review.
[2] Toby Berger, et al. Rate distortion theory: a mathematical basis for data compression, 1971.
[3] Suguru Arimoto, et al. An algorithm for computing the capacity of arbitrary discrete memoryless channels, 1972, IEEE Trans. Inf. Theory.
[4] Richard E. Blahut, et al. Computation of channel capacity and rate-distortion functions, 1972, IEEE Trans. Inf. Theory.
[5] Ward Whitt, et al. Approximations of Dynamic Programs, I, 1978, Math. Oper. Res.
[6] D. Bertsekas, et al. Adaptive aggregation methods for infinite horizon dynamic programming, 1989.
[7] Geoffrey E. Hinton, et al. Feudal Reinforcement Learning, 1992, NIPS.
[8] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[9] Michael I. Jordan, et al. Reinforcement Learning with Soft State Aggregation, 1994, NIPS.
[10] Andrew McCallum, et al. Reinforcement learning with selective perception and hidden state, 1996.
[11] Stefan Schaal, et al. Robot Learning From Demonstration, 1997, ICML.
[12] Robert Givan, et al. Model Reduction Techniques for Computing Approximately Optimal Solutions for Markov Decision Processes, 1997, UAI.
[13] Milos Hauskrecht, et al. Hierarchical Solution of Markov Decision Processes using Macro-actions, 1998, UAI.
[14] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[15] Naftali Tishby, et al. Agglomerative Information Bottleneck, 1999, NIPS.
[16] Thomas G. Dietterich. State Abstraction in MAXQ Hierarchical Reinforcement Learning, 1999, NIPS.
[17] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition, 1999, J. Artif. Intell. Res.
[18] Naftali Tishby, et al. The information bottleneck method, 2000, ArXiv.
[19] David Andre, et al. State abstraction for programmable reinforcement learning agents, 2002, AAAI/IAAI.
[20] Robert Givan, et al. Equivalence notions and model minimization in Markov decision processes, 2003, Artif. Intell.
[21] Doina Precup, et al. Metrics for Finite Markov Decision Processes, 2004, AAAI.
[22] Pieter Abbeel, et al. Apprenticeship learning via inverse reinforcement learning, 2004, ICML.
[23] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[24] C. E. Shannon. A Mathematical Theory of Communication, 1948, Bell Syst. Tech. J.
[25] Thomas J. Walsh, et al. Towards a Unified Theory of State Abstraction for MDPs, 2006, AI&M.
[26] Prasad Tadepalli, et al. Automatic Induction of MAXQ Hierarchies, 2007.
[27] Brett Browning, et al. A survey of robot learning from demonstration, 2009, Robotics Auton. Syst.
[28] Scott Kuindersma, et al. Constructing Skill Trees for Reinforcement Learning Agents from Demonstration Trajectories, 2010, NIPS.
[29] Andrea Lockerd Thomaz, et al. Automatic State Abstraction from Demonstration, 2011, IJCAI.
[30] Charles Kemp, et al. How to Grow a Mind: Statistics, Structure, and Abstraction, 2011, Science.
[31] Daniel Polani, et al. Information Theory of Decisions and Actions, 2011.
[32] Doina Precup, et al. An information-theoretic approach to curiosity-driven reinforcement learning, 2012, Theory in Biosciences.
[33] H. B. Barlow, et al. Possible Principles Underlying the Transformations of Sensory Messages, 2012.
[34] Naftali Tishby, et al. Trading Value and Information in MDPs, 2012.
[35] Thomas G. Dietterich, et al. State Aggregation in Monte Carlo Tree Search, 2014, AAAI.
[36] Max Welling, et al. Auto-Encoding Variational Bayes, 2013, ICLR.
[37] Alec Solway, et al. Optimal Behavioral Hierarchy, 2014, PLoS Comput. Biol.
[38] Nan Jiang, et al. Improving UCT planning via approximate homomorphisms, 2014, AAMAS.
[39] Marc G. Bellemare, et al. Compress and Control, 2015, AAAI.
[40] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[41] Anant Sahai, et al. Control capacity, 2015, IEEE International Symposium on Information Theory (ISIT).
[42] John Lygeros, et al. Efficient Approximation of Channel Capacities, 2015, IEEE Transactions on Information Theory.
[43] Alec Solway, et al. Reinforcement learning, efficient coding, and the statistics of natural tasks, 2015, Current Opinion in Behavioral Sciences.
[44] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents (Extended Abstract), 2012, IJCAI.
[45] Yuval Peres, et al. Rate-limited control of systems with uncertain gain, 2016, 54th Annual Allerton Conference on Communication, Control, and Computing (Allerton).
[46] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[47] C. Sims. Rate–distortion theory and human perception, 2016, Cognition.
[48] Michael L. Littman, et al. Near Optimal Behavior via Approximate State Abstraction, 2016, ICML.
[49] Victoria Kostina, et al. Information Performance Tradeoffs in Control, 2016, ArXiv.
[50] M. Littman, et al. Toward Good Abstractions for Lifelong Learning, 2017.
[51] David J. Schwab, et al. The Deterministic Information Bottleneck, 2015, Neural Computation.
[52] Andriy Mnih, et al. Disentangling by Factorising, 2018, ICML.
[53] Anru Zhang, et al. State Compression of Markov Processes via Empirical Low-Rank Estimation, 2018, ArXiv.
[54] Chris R. Sims, et al. Policy Generalization In Capacity-Limited Reinforcement Learning, 2018.
[55] Michael L. Littman, et al. State Abstractions for Lifelong Reinforcement Learning, 2018, ICML.
[56] Marc G. Bellemare, et al. Approximate Exploration through State Abstraction, 2018, ArXiv.
[57] Chris R. Sims, et al. Efficient coding explains the universal law of generalization in human perception, 2018, Science.
[58] Nan Jiang, et al. Hierarchical Imitation and Reinforcement Learning, 2018, ICML.
[59] Sergey Levine, et al. InfoBot: Structured Exploration in Reinforcement Learning Using Information Bottleneck, 2019.
[60] Joelle Pineau, et al. The Bottleneck Simulator: A Model-based Deep Reinforcement Learning Approach, 2018, J. Artif. Intell. Res.