[1] Pieter Abbeel, et al. Equivalence Between Policy Gradients and Soft Q-Learning, 2017, ArXiv.
[2] Stefano Ermon, et al. Learning Large-Scale Dynamic Discrete Choice Models of Spatio-Temporal Preferences with Application to Migratory Pastoralism in East Africa, 2015, AAAI.
[3] Zizhuo Wang, et al. On the Relation Between Several Discrete Choice Models, 2015.
[4] Sergey Levine, et al. Reinforcement Learning with Deep Energy-Based Policies, 2017, ICML.
[5] Garud Iyengar, et al. Robust Dynamic Programming, 2005, Math. Oper. Res.
[6] Yuval Tassa, et al. Relative Entropy Regularized Policy Iteration, 2018, ArXiv.
[7] Victor Aguirregabiria, et al. Dynamic Discrete Choice Structural Models: A Survey, 2010, SSRN Electronic Journal.
[8] Kyungjae Lee, et al. Sparse Markov Decision Processes With Causal Sparse Tsallis Entropy Regularization for Reinforcement Learning, 2018, IEEE Robotics and Automation Letters.
[9] Roy Fox, et al. Taming the Noise in Reinforcement Learning via Soft Updates, 2015, UAI.
[10] Sergey Levine, et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, 2018, ICML.
[11] J. M. Steele. Probability Theory and Combinatorial Optimization, 1997.
[12] Sergey Levine, et al. A Connection between Generative Adversarial Networks, Inverse Reinforcement Learning, and Energy-Based Models, 2016, ArXiv.
[13] Martin L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[14] C. E. Shannon. A Mathematical Theory of Communication, 1948, Bell Syst. Tech. J.
[15] Henry Zhu, et al. Soft Actor-Critic Algorithms and Applications, 2018, ArXiv.
[16] Hilbert J. Kappen, et al. Dynamic Policy Programming, 2010, J. Mach. Learn. Res.
[17] Chung-Piaw Teo, et al. Persistency Model and Its Applications in Choice Modeling, 2009, Manag. Sci.
[18] John Rust. Optimal Replacement of GMC Bus Engines: An Empirical Model of Harold Zurcher, 1987.
[19] Matthieu Geist, et al. A Theory of Regularized Markov Decision Processes, 2019, ICML.
[20] Anind K. Dey, et al. Maximum Entropy Inverse Reinforcement Learning, 2008, AAAI.
[21] A. Anas. Discrete Choice Theory, Information Theory and the Multinomial Logit and Gravity Models, 1983.
[22] Ofir Nachum, et al. Path Consistency Learning in Tsallis Entropy Regularized MDPs, 2018, ICML.
[23] Dimitris Bertsimas, et al. Persistence in Discrete Optimization under Data Uncertainty, 2006, Math. Program.
[24] Xiaobo Li, et al. On Theoretical and Empirical Aspects of Marginal Distribution Choice Models, 2014, Manag. Sci.
[25] Anind K. Dey, et al. Modeling Interaction via the Principle of Maximum Causal Entropy, 2010, ICML.
[26] Yuval Tassa, et al. Maximum a Posteriori Policy Optimisation, 2018, ICLR.
[27] Sergey Levine, et al. Learning Robust Rewards with Adversarial Inverse Reinforcement Learning, 2017, ICLR.
[28] John Rust. Maximum Likelihood Estimation of Discrete Control Processes, 1988.
[29] Martin A. Riedmiller, et al. Robust Reinforcement Learning for Continuous Control with Model Misspecification, 2019, ICLR.
[30] E. Altman. Constrained Markov Decision Processes, 1999.
[31] J. Kadane. Structural Analysis of Discrete Data with Econometric Applications, 1984.
[32] Stefano Ermon, et al. Generative Adversarial Imitation Learning, 2016, NIPS.