Infinite time horizon maximum causal entropy inverse reinforcement learning
[1] Michael I. Jordan, et al. MASSACHUSETTS INSTITUTE OF TECHNOLOGY ARTIFICIAL INTELLIGENCE LABORATORY and CENTER FOR BIOLOGICAL AND COMPUTATIONAL LEARNING DEPARTMENT OF BRAIN AND COGNITIVE SCIENCES, 1996.
[2] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.
[3] Gerhard Kramer, et al. Directed information for channels with feedback, 1998.
[4] Tim Hesterberg, et al. Introduction to Stochastic Search and Optimization: Estimation, Simulation, and Control, 2004, Technometrics.
[5] A. Dawid, et al. Game theory, maximum entropy, minimum discrepancy and robust Bayesian decision theory, 2004, math/0410076.
[6] Pieter Abbeel, et al. Apprenticeship learning via inverse reinforcement learning, 2004, ICML.
[7] Roger G. Ghanem, et al. Asymptotic Sampling Distribution for Polynomial Chaos Representation of Data: A Maximum Entropy and Fisher information approach, 2006, Proceedings of the 45th IEEE Conference on Decision and Control.
[8] J. Andrew Bagnell, et al. Maximum margin planning, 2006, ICML.
[9] Stephen P. Boyd, et al. Convex Optimization, 2004, Algorithms and Theory of Computation Handbook.
[10] Lars Peter Hansen, et al. Robust control and model misspecification, 2006, J. Econ. Theory.
[11] Anind K. Dey, et al. Maximum Entropy Inverse Reinforcement Learning, 2008, AAAI.
[12] J. Andrew Bagnell, et al. Modeling Purposeful Adaptive Behavior with the Principle of Maximum Causal Entropy, 2010.
[13] Pieter Abbeel, et al. Autonomous Helicopter Aerobatics through Apprenticeship Learning, 2010, Int. J. Robotics Res.
[14] Anind K. Dey, et al. Modeling Interaction via the Principle of Maximum Causal Entropy, 2010, ICML.
[15] Jerome Le Ny, et al. Feedback control of the National Airspace System to mitigate weather disruptions, 2010, 49th IEEE Conference on Decision and Control (CDC).
[16] Jan Peters, et al. Relative Entropy Inverse Reinforcement Learning, 2011, AISTATS.
[17] Brian D. Ziebart. Factorized decision forecasting via combining value-based and reward-based estimation, 2011, 2011 49th Annual Allerton Conference on Communication, Control, and Computing (Allerton).
[18] Er Meng Joo, et al. A review of inverse reinforcement learning theory and recent advances, 2012, IEEE Congress on Evolutionary Computation.
[19] Anind K. Dey, et al. The Principle of Maximum Causal Entropy for Estimating Interacting Processes, 2013, IEEE Transactions on Information Theory.
[20] Yi Zhou, et al. Dynamic Queuing Network Model for Flow Contingency Management, 2011, IEEE Transactions on Intelligent Transportation Systems.
[21] Michael Bloem, et al. Ground Delay Program Analytics with Behavioral Cloning and Inverse Reinforcement Learning, 2014, J. Aerosp. Inf. Syst.
[22] Wolfram Burgard, et al. Inverse Reinforcement Learning with Simultaneous Estimation of Rewards and Dynamics, 2016, AISTATS.