Variable risk control via stochastic optimization
[1] Milton Abramowitz,et al. Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables , 1964 .
[2] Harold J. Kushner,et al. A New Method of Locating the Maximum Point of an Arbitrary Multipeak Curve in the Presence of Noise , 1964 .
[4] Rhodes,et al. Optimal stochastic linear systems with exponential performance criteria and their relation to deterministic differential games , 1973 .
[5] P. Whittle. Risk-sensitive linear/quadratic/gaussian control , 1981, Advances in Applied Probability.
[6] Francis L. Merat,et al. Introduction to robotics: Mechanics and control , 1987, IEEE J. Robotics Autom..
[7] R. Tibshirani,et al. Local Likelihood Estimation , 1987 .
[8] P. Whittle. Risk-Sensitive Optimal Control , 1990 .
[9] Marwan A. Jabri,et al. Weight perturbation: an optimal architecture and learning technique for analog VLSI feedforward and recurrent multilayer networks , 1992, IEEE Trans. Neural Networks.
[10] D. Dennis,et al. A statistical method for global optimization , 1992, [Proceedings] 1992 IEEE International Conference on Systems, Man, and Cybernetics.
[12] Matthias Heger,et al. Consideration of Risk in Reinforcement Learning , 1994, ICML.
[13] A. Kacelnik,et al. Risky Theories—The Effects of Variance on Foraging Decisions , 1996 .
[14] Daniel Hernández-Hernández,et al. Risk Sensitive Markov Decision Processes , 1997 .
[15] Paul W. Goldberg,et al. Regression with Input-dependent Noise: A Gaussian Process Treatment , 1997, NIPS.
[16] Shun-ichi Amari,et al. Natural Gradient Works Efficiently in Learning , 1998, Neural Computation.
[17] Donald R. Jones,et al. Global versus local search in constrained optimization of computer models , 1998 .
[18] Stuart J. Russell,et al. Bayesian Q-Learning , 1998, AAAI/IAAI.
[19] John N. Tsitsiklis,et al. Gradient Convergence in Gradient Methods with Errors , 1999, SIAM J. Optim..
[20] Andrew G. Barto,et al. Robot Weightlifting By Direct Policy Search , 2001, IJCAI.
[21] Donald R. Jones,et al. A Taxonomy of Global Optimization Methods Based on Response Surfaces , 2001, J. Glob. Optim..
[22] Sham M. Kakade,et al. A Natural Policy Gradient , 2001, NIPS.
[23] Vivek S. Borkar,et al. Q-Learning for Risk-Sensitive Control , 2002, Math. Oper. Res..
[24] M. Bateson. Recent advances in our understanding of risk-sensitive foraging preferences , 2002, Proceedings of the Nutrition Society.
[25] Isaac Meilijson,et al. Evolution of Reinforcement Learning in Uncertain Environments: A Simple Explanation for Complex Foraging Behaviors , 2002, Adapt. Behav..
[26] Ralph Neuneier,et al. Risk-Sensitive Reinforcement Learning , 1998, Machine Learning.
[27] Ronald J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 1992, Machine Learning.
[28] H. Sebastian Seung,et al. Stochastic policy gradient reinforcement learning on a simple 3D biped , 2004, 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (IEEE Cat. No.04CH37566).
[29] Peter Stone,et al. Machine Learning for Fast Quadrupedal Locomotion , 2004, AAAI.
[30] Zoubin Ghahramani,et al. Variable Noise and Dimensionality Reduction for Sparse Gaussian processes , 2006, UAI.
[31] Stefan Schaal,et al. Policy Gradient Methods for Robotics , 2006, 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[32] J. O'Doherty,et al. Reward Value Coding Distinct From Risk Attitude-Related Uncertainty Coding in Human Reward Systems , 2006, Journal of Neurophysiology.
[33] Nando de Freitas,et al. Active Policy Learning for Robot Planning and Exploration under Uncertainty , 2007, Robotics: Science and Systems.
[34] Jack L. Treynor,et al. Mutual Fund Performance , 2007 .
[35] Tao Wang,et al. Automatic Gait Optimization with Gaussian Process Regression , 2007, IJCAI.
[36] Wolfram Burgard,et al. Most likely heteroscedastic Gaussian process regression , 2007, ICML '07.
[37] Russ Tedrake,et al. Signal-to-Noise Ratio Analysis of Policy Gradient Algorithms , 2008, NIPS.
[38] Marcus R. Frean,et al. Using Gaussian Processes to Optimize Expensive Functions , 2008, Australasian Conference on Artificial Intelligence.
[39] S. Quartz,et al. Human Insula Activation Reflects Risk Prediction Errors As Well As Risk , 2008, The Journal of Neuroscience.
[40] Michael A. Osborne,et al. Gaussian Processes for Global Optimization , 2008 .
[41] Carl E. Rasmussen,et al. Gaussian processes for machine learning , 2005, Adaptive computation and machine learning.
[42] Christian Laugier,et al. The International Journal of Robotics Research (IJRR) - Special issue on "Field and Service Robotics" , 2009 .
[43] Andrew Y. Ng,et al. Policy search via the signed derivative , 2009, Robotics: Science and Systems.
[44] Nando de Freitas,et al. A Bayesian exploration-exploitation approach for optimal online sensing and planning with a visually guided mobile robot , 2009, Auton. Robots.
[45] L. Maloney,et al. Economic decision-making compared with an equivalent motor task , 2009, Proceedings of the National Academy of Sciences.
[46] Scott Kuindersma,et al. Dexterous mobility with the uBot-5 mobile manipulator , 2009, 2009 International Conference on Advanced Robotics.
[47] Andreas Krause,et al. Information-Theoretic Regret Bounds for Gaussian Process Optimization in the Bandit Setting , 2009, IEEE Transactions on Information Theory.
[48] Stefan Schaal,et al. Reinforcement learning of motor skills in high dimensions: A path integral approach , 2010, 2010 IEEE International Conference on Robotics and Automation.
[49] Masashi Sugiyama,et al. Nonparametric Return Distribution Approximation for Reinforcement Learning , 2010, ICML.
[50] E. Vázquez,et al. Convergence properties of the expected improvement algorithm with fixed mean and covariance functions , 2007, arXiv:0712.3744 .
[51] Jan Peters,et al. Policy Search for Motor Primitives in Robotics , 2011, Machine Learning.
[52] Jun Zhang,et al. Motor Learning at Intermediate Reynolds Number: Experiments with Policy Gradient on the Flapping Flight of a Rigid Wing , 2010, From Motor Learning to Interaction Learning in Robots.
[53] Roman Garnett,et al. Bayesian optimization for sensor set selection , 2010, IPSN '10.
[54] Masashi Sugiyama,et al. Parametric Return Density Estimation for Reinforcement Learning , 2010, UAI.
[55] Roderic A. Grupen,et al. Whole-body strategies for mobility and manipulation , 2010 .
[56] Marc Peter Deisenroth,et al. Efficient reinforcement learning using Gaussian processes , 2010 .
[57] Daniel A. Braun,et al. Risk-Sensitive Optimal Feedback Control Accounts for Sensorimotor Behavior under Uncertainty , 2010, PLoS Comput. Biol..
[58] Hilbert J. Kappen,et al. Risk Sensitive Path Integral Control , 2010, UAI.
[59] Nando de Freitas,et al. A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning , 2010, ArXiv.
[60] Olivier Sigaud,et al. From Motor Learning to Interaction Learning in Robots , 2010, From Motor Learning to Interaction Learning in Robots.
[61] Andrew Gordon Wilson,et al. Generalised Wishart Processes , 2010, UAI.
[62] Daniel A. Braun,et al. Risk-Sensitivity in Sensorimotor Control , 2011, Front. Hum. Neurosci..
[63] Carl E. Rasmussen,et al. PILCO: A Model-Based and Data-Efficient Approach to Policy Search , 2011, ICML.
[64] Daniel A. Braun,et al. Risk-sensitivity and the mean-variance trade-off: decision making in sensorimotor control , 2011, Proceedings of the Royal Society B: Biological Sciences.
[65] Adam D. Bull,et al. Convergence Rates of Efficient Global Optimization Algorithms , 2011, J. Mach. Learn. Res..
[66] Howie Choset,et al. Using response surfaces and expected improvement to optimize snake robot gait parameters , 2011, 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[67] Alan Fern,et al. A Behavior Based Kernel for Policy Search via Bayesian Optimization , 2011 .
[68] Miguel Lázaro-Gredilla,et al. Variational Heteroscedastic Gaussian Process Regression , 2011, ICML.
[69] Scott Kuindersma,et al. Learning dynamic arm motions for postural recovery , 2011, 2011 11th IEEE-RAS International Conference on Humanoid Robots.
[70] D. Lizotte,et al. An experimental methodology for response surface optimization methods , 2012, J. Glob. Optim..
[71] Olivier Sigaud,et al. Path Integral Policy Improvement with Covariance Matrix Adaptation , 2012, ICML.
[73] A. Barto,et al. Variable Risk Dynamic Mobile Manipulation , 2012 .
[74] Shie Mannor,et al. Policy Gradients with Variance Related Risk Criteria , 2012, ICML.
[75] P. Dayan,et al. Neural Prediction Errors Reveal a Risk-Sensitive Reinforcement-Learning Process in the Human Brain , 2012, The Journal of Neuroscience.
[76] Darwin G. Caldwell,et al. Direct policy search reinforcement learning based on particle filtering , 2012, EWRL 2012.
[77] Scott Kuindersma,et al. Variational Bayesian Optimization for Runtime Risk-Sensitive Control , 2012, Robotics: Science and Systems.
[78] Klaus Obermayer,et al. Risk-Sensitive Reinforcement Learning , 2013, Neural Computation.