Least Absolute Policy Iteration: A Robust Approach to Value Function Approximation
Masashi Sugiyama | Hisashi Kashima | Tetsuro Morimura | Hirotaka Hachiya
[1] Matthias Heger,et al. Consideration of risk in reinforcement learning , 1994, ICML.
[2] E. Newport,et al. Statistical Learning: From Acquiring Specific Items to Forming General Rules , 2022, Current Directions in Psychological Science.
[3] Mathematical and Computing Sciences , 2004 .
[4] B. Ripley,et al. Robust Statistics , 2018, Encyclopedia of Mathematical Geosciences.
[5] Sanjoy Dasgupta,et al. Off-Policy Temporal Difference Learning with Function Approximation , 2001, ICML.
[6] Philippe Artzner,et al. Coherent Measures of Risk , 1999 .
[7] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[8] R. Rockafellar,et al. Conditional Value-at-Risk for General Loss Distributions , 2001 .
[9] F. Girosi,et al. Networks for approximation and learning , 1990, Proc. IEEE.
[10] R. Tibshirani. Regression Shrinkage and Selection via the Lasso , 1996 .
[11] Shun-ichi Amari,et al. A Theory of Adaptive Pattern Classifiers , 1967, IEEE Trans. Electron. Comput.
[12] Michael A. Saunders,et al. Atomic Decomposition by Basis Pursuit , 1998, SIAM J. Sci. Comput.
[13] Arthur E. Hoerl,et al. Ridge Regression: Biased Estimation for Nonorthogonal Problems , 2000, Technometrics.
[14] Michail G. Lagoudakis,et al. Least-Squares Policy Iteration , 2003, J. Mach. Learn. Res.
[15] Makoto Sato,et al. TD algorithm for the variance of return and mean-variance reinforcement learning , 2001 .
[16] Douglas C. Hittle,et al. Robust reinforcement learning control with static and dynamic stability , 2001 .
[17] G. Wahba. Spline models for observational data , 1990 .
[18] Douglas C. Hittle,et al. Robust Reinforcement Learning Control Using Integral Quadratic Constraints for Recurrent Neural Networks , 2007, IEEE Transactions on Neural Networks.
[19] David R. Musicant,et al. Robust Linear and Support Vector Regression , 2000, IEEE Trans. Pattern Anal. Mach. Intell.
[20] Ayhan Demiriz,et al. Linear Programming Boosting via Column Generation , 2002, Machine Learning.
[21] Tomaso A. Poggio,et al. Regularization Networks and Support Vector Machines , 2000, Adv. Comput. Math.
[22] Ralph Neuneier,et al. Risk-Sensitive Reinforcement Learning , 1998, Machine Learning.
[23] Peter J. Rousseeuw,et al. Robust regression and outlier detection , 1987 .
[24] Jun Morimoto,et al. Robust Reinforcement Learning , 2005, Neural Computation.
[25] Doina Precup,et al. Eligibility Traces for Off-Policy Policy Evaluation , 2000, ICML.
[26] Vladimir Vapnik,et al. Statistical learning theory , 1998 .
[27] Stephen P. Boyd,et al. Convex Optimization , 2004, Algorithms and Theory of Computation Handbook.
[28] Masashi Sugiyama,et al. Adaptive importance sampling for value function approximation in off-policy reinforcement learning , 2009, Neural Networks.
[29] Peter M. Williams,et al. Bayesian Regularization and Pruning Using a Laplace Prior , 1995, Neural Computation.
[30] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.