Learning decisions: robustness, uncertainty, and approximation
[3] M. H. DeGroot. Optimal Statistical Decisions , 1970 .
[4] T. Başar,et al. Dynamic Noncooperative Game Theory , 1982 .
[5] Suguru Arimoto,et al. Bettering operation of robots by learning , 1984, Journal of Robotic Systems.
[6] Peter C. Cheeseman,et al. In Defense of Probability , 1985, IJCAI.
[7] Peter W. Glynn,et al. Proceedings of the 1986 Winter Simulation Conference , 1986 .
[9] Geoffrey E. Hinton,et al. Learning internal representations by error propagation , 1986 .
[10] Dean Pomerleau,et al. ALVINN, an autonomous land vehicle in a neural network , 1989, NIPS.
[11] Lawrence R. Rabiner,et al. A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.
[12] Andrew W. Moore,et al. Efficient memory-based learning for robot control , 1990 .
[13] Christopher G. Atkeson,et al. Using locally weighted regression for robot learning , 1991, Proceedings. 1991 IEEE International Conference on Robotics and Automation.
[14] Thomas M. Cover,et al. Elements of Information Theory , 1991 .
[15] Andrew W. Moore,et al. Hoeffding Races: Accelerating Model Selection Search for Classification and Function Approximation , 1993, NIPS.
[16] Michael L. Littman,et al. Memoryless policies: theoretical limitations and practical results , 1994 .
[17] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[18] Matthias Heger,et al. Consideration of Risk in Reinforcement Learning , 1994, ICML.
[19] Andrew W. Moore,et al. Memory-based Stochastic Optimization , 1995, NIPS.
[20] Geoffrey J. Gordon. Stable Function Approximation in Dynamic Programming , 1995, ICML.
[21] Csaba Szepesvári,et al. A Generalized Reinforcement-Learning Model: Convergence and Applications , 1996, ICML.
[22] Dimitri P. Bertsekas,et al. Neuro-Dynamic Programming , 1996, Athena Scientific.
[23] Thomas G. Dietterich. What is machine learning?
[24] Jeff G. Schneider,et al. Exploiting Model Uncertainty Estimates for Safe Dynamic Control Learning , 1996, NIPS.
[25] W. Fleming,et al. Risk-Sensitive Control of Finite State Machines on an Infinite Horizon I , 1997 .
[26] John D. Lafferty,et al. Inducing Features of Random Fields , 1997, IEEE Trans. Pattern Anal. Mach. Intell..
[27] Andrew W. Moore,et al. Gradient Descent for General Reinforcement Learning , 1998, NIPS.
[28] Edoardo Amaldi,et al. On the Approximability of Minimizing Nonzero Variables or Unsatisfied Relations in Linear Systems , 1998, Theor. Comput. Sci..
[29] P. Bartlett,et al. Direct Gradient-Based Reinforcement Learning: II. Gradient Ascent Algorithms and Experiments , 1999 .
[30] Takeo Kanade,et al. System identification of small-size unmanned helicopter dynamics , 1999 .
[31] Yishay Mansour,et al. Approximate Planning in Large POMDPs via Reusable Trajectories , 1999, NIPS.
[32] John N. Tsitsiklis,et al. Actor-Critic Algorithms , 1999, NIPS.
[33] Andrew Y. Ng,et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping , 1999, ICML.
[34] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[35] P. Bartlett,et al. Direct Gradient-Based Reinforcement Learning: I. Gradient Estimation Algorithms , 1999 .
[36] Wolfram Burgard,et al. Experiences with an Interactive Museum Tour-Guide Robot , 1999, Artif. Intell..
[37] Steven M. LaValle,et al. Rapidly-Exploring Random Trees: Progress and Prospects , 2000 .
[38] Peter W. Glynn,et al. Kernel-Based Reinforcement Learning in Average-Cost Problems: An Application to Optimal Portfolio Choice , 2000, NIPS.
[39] Shun-ichi Amari,et al. Methods of information geometry , 2000 .
[40] Robert Givan,et al. Bounded-parameter Markov decision processes , 2000, Artif. Intell..
[41] Vladimir N. Vapnik,et al. The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.
[42] Andrew W. Moore,et al. 'N-Body' Problems in Statistical Learning , 2000, NIPS.
[43] Michael I. Jordan,et al. PEGASUS: A policy search method for large MDPs and POMDPs , 2000, UAI.
[44] N. Čencov. Statistical Decision Rules and Optimal Inference , 1982 .
[45] Peter L. Bartlett,et al. Functional Gradient Techniques for Combining Hypotheses , 2000 .
[46] Sham M. Kakade,et al. A Natural Policy Gradient , 2001, NIPS.
[47] Jorge Nocedal,et al. Numerical Optimization , 1999, Springer.
[48] Carlos Guestrin,et al. Multiagent Planning with Factored MDPs , 2001, NIPS.
[49] Christian R. Shelton,et al. Importance sampling for reinforcement learning with multiple objectives , 2001 .
[50] Jun Morimoto,et al. Nonparametric Representation of Policies and Value Functions: A Trajectory-Based Approach , 2002, NIPS.
[51] John Langford,et al. Approximately Optimal Approximate Reinforcement Learning , 2002, ICML.
[52] Benjamin Van Roy,et al. Approximate Linear Programming for Average-Cost Dynamic Programming , 2002, NIPS.
[53] Kevin P. Murphy. Dynamic Bayesian networks: representation, inference and learning , 2002, Ph.D. thesis, University of California, Berkeley.
[54] William H. Press,et al. Numerical Recipes in C , 2002 .
[55] Michail G. Lagoudakis,et al. Coordinated Reinforcement Learning , 2002, ICML.
[56] Chris Urmson,et al. A generic framework for robotic navigation , 2003, 2003 IEEE Aerospace Conference Proceedings (Cat. No.03TH8652).
[57] S. Shankar Sastry,et al. Autonomous Helicopter Flight via Reinforcement Learning , 2003, NIPS.
[58] Carlos Guestrin. Planning under uncertainty in complex structured environments , 2003, Ph.D. thesis, Stanford University.
[59] Jeff G. Schneider,et al. Covariant policy search , 2003, IJCAI.
[60] Sham M. Kakade. On the sample complexity of reinforcement learning , 2003 .
[61] J. Langford,et al. Reducing T-step reinforcement learning to classification , 2003 .
[62] Jeff G. Schneider,et al. Policy Search by Dynamic Programming , 2003, NIPS.
[63] Sebastian Thrun,et al. Perspectives on standardization in mobile robot programming: the Carnegie Mellon Navigation (CARMEN) Toolkit , 2003, Proceedings 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2003) (Cat. No.03CH37453).
[64] Michail G. Lagoudakis,et al. Reinforcement Learning as Classification: Leveraging Modern Classifiers , 2003, ICML.
[65] Alexander J. Smola,et al. Online learning with kernels , 2004, IEEE Transactions on Signal Processing.
[66] Jan Peters,et al. Reinforcement Learning for Humanoid Robots - Policy Gradients and Beyond , 2004 .
[67] Ronald J. Williams. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 1992, Machine Learning.
[69] Sekhar Tatikonda,et al. Control under communication constraints , 2004, IEEE Transactions on Automatic Control.
[70] Yishay Mansour,et al. A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes , 2002, Machine Learning.
[71] M. Littman,et al. Exploration via Model-based Interval Estimation , 2004 .
[72] Michael Kearns,et al. Near-Optimal Reinforcement Learning in Polynomial Time , 2002, Machine Learning.
[73] R. Schapire. The Strength of Weak Learnability , 1990, Machine Learning.
[74] Jun Morimoto,et al. Robust Reinforcement Learning , 2005, Neural Computation.
[75] Laurent El Ghaoui,et al. Robust Solutions to Markov Decision Problems with Uncertain Transition Matrices , 2005 .
[76] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, MIT Press.