An Alternative to Variance: Gini Deviation for Risk-averse Policy Gradient
[1] K. Rezaee,et al. Benchmarking Constraint Inference in Inverse Reinforcement Learning , 2022, ICLR.
[2] Shie Mannor,et al. Efficient Risk-Averse Reinforcement Learning , 2022, NeurIPS.
[3] Matthijs T. J. Spaan,et al. WCSAC: Worst-Case Soft Actor Critic for Safety-Constrained Reinforcement Learning , 2021, AAAI.
[4] Fan Zhou,et al. Non-decreasing Quantile Function Network with Efficient Exploration for Distributional Reinforcement Learning , 2021, IJCAI.
[5] A. Aghasi,et al. Inverse Constrained Reinforcement Learning , 2020, ICML.
[6] Mingyuan Zhou,et al. Implicit Distributional Reinforcement Learning , 2020, NeurIPS.
[7] Shimon Whiteson,et al. Mean-Variance Policy Iteration for Risk-Averse Reinforcement Learning , 2020, AAAI.
[8] Marcello Restelli,et al. Risk-Averse Trust Region Optimization for Reward-Volatility Reduction , 2019, IJCAI.
[9] Ruslan Salakhutdinov,et al. Worst Cases Policy Gradients , 2019, CoRL.
[10] G. Willmot,et al. Characterization, Robustness and Aggregation of Signed Choquet Integrals , 2019 .
[11] Tatsuya Mori,et al. Learning Robust Options by Conditional Value at Risk Optimization , 2019, NeurIPS.
[12] Mohammad Naghshvar,et al. Risk-averse Behavior Planning for Autonomous Driving under Uncertainty , 2018, ArXiv.
[13] Bo Liu,et al. A Block Coordinate Ascent Algorithm for Mean-Variance Optimization , 2018, NeurIPS.
[14] Jun Cai,et al. Convex Risk Functionals: Representation and Applications , 2018, Insurance: Mathematics and Economics.
[15] Rémi Munos,et al. Implicit Quantile Networks for Distributional Reinforcement Learning , 2018, ICML.
[16] Herke van Hoof,et al. Addressing Function Approximation Error in Actor-Critic Methods , 2018, ICML.
[17] Marc G. Bellemare,et al. Distributional Reinforcement Learning with Quantile Regression , 2017, AAAI.
[18] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.
[19] Balaraman Ravindran,et al. EPOpt: Learning Robust Neural Network Policies Using Model Ensembles , 2016, ICLR.
[20] Ruodu Wang,et al. Gini-Type Measures of Risk and Variability: Gini Shortfall, Capital Allocations, and Heavy-Tailed Risks , 2016 .
[21] Marco Pavone,et al. Risk-Constrained Reinforcement Learning with Percentile Risk Criteria , 2015, J. Mach. Learn. Res.
[22] Shie Mannor,et al. Risk-Sensitive and Robust Decision-Making: a CVaR Optimization Approach , 2015, NIPS.
[23] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[24] Michael I. Jordan,et al. Trust Region Policy Optimization , 2015, ICML.
[25] Mohammad Ghavamzadeh,et al. Algorithms for CVaR Optimization in MDPs , 2014, NIPS.
[26] Shie Mannor,et al. Optimizing the CVaR via Sampling , 2014, AAAI.
[27] X. Zhou,et al. Mean–Variance Portfolio Optimization with State‐Dependent Risk Aversion , 2014 .
[28] Mohammad Ghavamzadeh,et al. Actor-Critic Algorithms for Risk-Sensitive MDPs , 2013, NIPS.
[29] Yuval Tassa,et al. MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[30] Joaquin Quiñonero Candela,et al. Counterfactual reasoning and learning systems: the example of computational advertising , 2012, J. Mach. Learn. Res.
[31] Shie Mannor,et al. Policy Gradients with Variance Related Risk Criteria , 2012, ICML.
[32] John N. Tsitsiklis,et al. Mean-Variance Optimization in Markov Decision Processes , 2011, ICML.
[33] Arjun K. Gupta,et al. Convex Ordering of Random Variables and its Applications in Econometrics and Actuarial Science , 2010 .
[34] Bogdan Grechuk,et al. Maximum Entropy Principle with General Deviation Measures , 2009, Math. Oper. Res..
[35] Stan Uryasev,et al. Generalized deviations in risk analysis , 2004, Finance Stochastics.
[36] Vivek S. Borkar,et al. Q-Learning for Risk-Sensitive Control , 2002, Math. Oper. Res..
[37] Duan Li,et al. Optimal Dynamic Portfolio Selection: Multiperiod Mean‐Variance Formulation , 2000 .
[38] Philippe Artzner,et al. Coherent Measures of Risk , 1999 .
[39] M. Grabisch. The application of fuzzy integrals in multicriteria decision making , 1996 .
[40] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[41] W. Sharpe,et al. Mean-Variance Analysis in Portfolio Choice and Capital Markets , 1987 .
[42] M. J. Sobel. The variance of discounted Markov decision processes , 1982, Journal of Applied Probability.
[43] M. Rothschild,et al. Increasing risk: I. A definition , 1970 .
[44] G. J. Glasser. Variance Formulas for the Mean Difference and Coefficient of Concentration , 1962 .
[45] P. Poupart,et al. Distributional Reinforcement Learning with Monotonic Splines , 2022, ICLR.
[46] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[47] Shlomo Yitzhaki,et al. Gini’s Mean difference: a superior measure of variability for non-normal distributions , 2003 .
[48] S. Kusuoka. On law invariant coherent risk measures , 2001 .
[49] G. Choquet. Theory of capacities , 1954 .
[50] C. Gini. Variabilità e mutabilità: contributo allo studio delle distribuzioni e delle relazioni statistiche , 1912 .
[51] K. Schittkowski,et al. Nonlinear Programming , 2022 .