Proposal Approximate Dynamic Programming Using Bellman Residual Elimination
[1] Jonathan P. How,et al. Approximate dynamic programming using Bellman residual elimination and Gaussian process regression , 2009, 2009 American Control Conference.
[2] Jonathan P. How,et al. Robust adaptive Markov Decision Processes in multi-vehicle applications , 2009, 2009 American Control Conference.
[3] Carl E. Rasmussen,et al. Gaussian processes for machine learning , 2005, Adaptive computation and machine learning.
[4] Dimitri P. Bertsekas,et al. Neuro-Dynamic Programming , 2009, Encyclopedia of Optimization.
[5] Jonathan P. How,et al. Approximate dynamic programming using support vector regression , 2008, 2008 47th IEEE Conference on Decision and Control.
[6] Masashi Sugiyama,et al. Geodesic Gaussian kernels for value function approximation , 2008, Auton. Robots.
[7] Jonathan P. How,et al. Experimental Demonstration of Adaptive MDP-Based Planning with Model Uncertainty , 2008 .
[8] Risto Miikkulainen,et al. Online kernel selection for Bayesian reinforcement learning , 2008, ICML '08.
[9] B. Bethke,et al. Group health management of UAV teams with applications to persistent surveillance , 2008, 2008 American Control Conference.
[10] Csaba Szepesvári,et al. Finite-Time Bounds for Fitted Value Iteration , 2008, J. Mach. Learn. Res..
[11] B. Bethke,et al. Real-time indoor autonomous vehicle test environment , 2008, IEEE Control Systems.
[12] Mikhail Belkin,et al. Towards a theoretical foundation for Laplacian-based manifold methods , 2005, J. Comput. Syst. Sci..
[13] B. Bethke. Kernel-Based Reinforcement Learning Using Bellman Residual Elimination , 2008 .
[14] Panos M. Pardalos,et al. Advances in Cooperative Control and Optimization , 2008 .
[15] Richard M. Murray,et al. Recent Research in Cooperative Control of Multivehicle Systems , 2007 .
[16] Jason Weston,et al. Large-scale kernel machines , 2007 .
[17] Jonathan P. How,et al. Mission Health Management for 24/7 Persistent Surveillance Operations , 2007 .
[18] Jonathan P. How,et al. Embedding Health Management into Mission Tasking for UAV Teams , 2007, 2007 American Control Conference.
[19] Masashi Sugiyama,et al. Value Function Approximation on Non-Linear Manifolds for Robot Motor Control , 2007, Proceedings 2007 IEEE International Conference on Robotics and Automation.
[20] Csaba Szepesvári,et al. Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path , 2006, Machine Learning.
[21] Mario J. Valenti. Approximate dynamic programming with applications in multi-agent systems , 2007 .
[22] Liming Xiang,et al. Kernel-Based Reinforcement Learning , 2006, ICIC.
[23] Daniel Polani,et al. Least Squares SVM for Least Squares TD Learning , 2006, ECAI.
[24] Sridhar Mahadevan,et al. Value Function Approximation with Diffusion Wavelets and Laplacian Eigenfunctions , 2005, NIPS.
[25] Carl E. Rasmussen,et al. A Unifying View of Sparse Approximate Gaussian Process Regression , 2005, J. Mach. Learn. Res..
[26] Sridhar Mahadevan,et al. Proto-value functions: developmental reinforcement learning , 2005, ICML.
[27] Shie Mannor,et al. Reinforcement learning with Gaussian processes , 2005, ICML.
[28] Jian-xiong Dong,et al. Fast SVM training algorithm with decomposition on very large data sets , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[29] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.
[30] Andrew G. Barto,et al. Linear Least-Squares Algorithms for Temporal Difference Learning , 2005, Machine Learning.
[31] Yaakov Engel,et al. Algorithms and representations for reinforcement learning , 2005 .
[32] Anthony Widjaja,et al. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond , 2003, IEEE Transactions on Neural Networks.
[33] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[34] Benjamin Van Roy,et al. A Cost-Shaping LP for Bellman Error Minimization with Performance Guarantees , 2004, NIPS.
[35] Bernhard Schölkopf,et al. A tutorial on support vector regression , 2004, Stat. Comput..
[36] Mikhail Belkin,et al. Semi-Supervised Learning on Riemannian Manifolds , 2004, Machine Learning.
[37] Justin A. Boyan,et al. Technical Update: Least-Squares Temporal Difference Learning , 2002, Machine Learning.
[38] Gerald Tesauro,et al. Practical issues in temporal difference learning , 1992, Machine Learning.
[39] William D. Smart. Explicit Manifold Representations for Value-Function Approximation in Reinforcement Learning , 2004, ISAIM.
[40] Carl E. Rasmussen,et al. Gaussian Processes in Reinforcement Learning , 2003, NIPS.
[41] Michail G. Lagoudakis,et al. Least-Squares Policy Iteration , 2003, J. Mach. Learn. Res..
[42] George L. Nemhauser,et al. Rerouting Aircraft for Airline Recovery , 2003, Transp. Sci..
[43] Benjamin Van Roy,et al. The Linear Programming Approach to Approximate Dynamic Programming , 2003, Oper. Res..
[44] Eric Bonabeau,et al. Control of UAV Swarms: What the Bugs Can Teach Us , 2003 .
[45] H. Van Dyke Parunak,et al. Swarming Coordination of Multiple UAV's for Collaborative Sensing , 2003 .
[46] Shie Mannor,et al. Bayes Meets Bellman: The Gaussian Process Approach to Temporal Difference Learning , 2003, ICML.
[47] Charles Stark. Optimization-Based Analysis of Collaborative Airport Arrival Planning , 2003 .
[48] Dimitri P. Bertsekas,et al. Least Squares Policy Evaluation Algorithms with Linear Function Approximation , 2003, Discret. Event Dyn. Syst..
[49] Joseph C. Hartman,et al. The series–parallel replacement problem , 2002 .
[50] Benjamin Van Roy,et al. The linear programming approach to approximate dynamic programming: theory and application , 2002 .
[51] Benjamin Van Roy,et al. Approximate Linear Programming for Average-Cost Dynamic Programming , 2002, NIPS.
[52] Carl E. Rasmussen,et al. Derivative Observations in Gaussian Process Models of Dynamic Systems , 2002, NIPS.
[53] Felipe Cucker,et al. On the mathematical foundations of learning , 2001 .
[54] S. Sathiya Keerthi,et al. Improvements to Platt's SMO Algorithm for SVM Classifier Design , 2001, Neural Computation.
[55] Gunnar Rätsch,et al. An introduction to kernel-based learning algorithms , 2001, IEEE Trans. Neural Networks.
[56] Xin Wang,et al. Batch Value Function Approximation via Support Vectors , 2001, NIPS.
[57] Kristin P. Bennett,et al. Support vector machines: hype or hallelujah? , 2000, SKDD.
[58] Bernhard Schölkopf,et al. New Support Vector Algorithms , 2000, Neural Computation.
[59] Vladimir N. Vapnik,et al. The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.
[60] Vladimir Vapnik,et al. An overview of statistical learning theory , 1999, IEEE Trans. Neural Networks.
[61] John C. Platt,et al. Fast training of support vector machines using sequential minimal optimization, advances in kernel methods , 1999 .
[62] Federico Girosi,et al. An Equivalence Between Sparse Approximation and Support Vector Machines , 1998, Neural Computation.
[63] Alexander Gammerman,et al. Ridge Regression Learning Algorithm in Dual Variables , 1998, ICML.
[64] Simon Haykin,et al. Neural Networks: A Comprehensive Foundation , 1998 .
[65] Dimitris Bertsimas,et al. The Air Traffic Flow Management Problem with Enroute Capacities , 1998, Oper. Res..
[66] Ram Gopalan,et al. The Aircraft Maintenance Routing Problem , 1998, Oper. Res..
[67] Cynthia Barnhart,et al. Integrated Airline Schedule Planning , 1998 .
[68] Federico Girosi,et al. An improved training algorithm for support vector machines , 1997, Neural Networks for Signal Processing VII. Proceedings of the 1997 IEEE Signal Processing Society Workshop.
[69] George L. Nemhauser,et al. The aircraft rotation problem , 1997, Ann. Oper. Res..
[70] Bernhard Schölkopf,et al. Support vector learning , 1997 .
[71] S. Ioffe,et al. Temporal Differences-Based Policy Iteration and Applications in Neuro-Dynamic Programming , 1996 .
[72] Bernhard Schölkopf,et al. Extracting Support Data for a Given Task , 1995, KDD.
[73] Leemon C. Baird,et al. Residual Algorithms: Reinforcement Learning with Function Approximation , 1995, ICML.
[74] Dimitri P. Bertsekas,et al. Dynamic Programming and Optimal Control, Two Volume Set , 1995 .
[75] Gerald Tesauro,et al. Temporal difference learning and TD-Gammon , 1995, CACM.
[76] Vladimir Vapnik,et al. The Nature of Statistical Learning , 1995 .
[77] Heekuck Oh,et al. Neural Networks for Pattern Recognition , 1993, Adv. Comput..
[78] Bernhard E. Boser,et al. A training algorithm for optimal margin classifiers , 1992, COLT '92.
[79] G. Wahba. Spline Models for Observational Data , 1990 .
[80] Kurt Hornik,et al. Multilayer feedforward networks are universal approximators , 1989, Neural Networks.
[81] V. Borkar. A convex analytic approach to Markov decision processes , 1988 .
[82] 齋藤 三郎,et al. Theory of reproducing kernels and its applications , 1988 .
[83] Saburou Saitoh,et al. Theory of Reproducing Kernels and Its Applications , 1988 .
[84] P. Schweitzer,et al. Generalized polynomial approximations in Markovian decision processes , 1985 .
[85] Richard S. Sutton,et al. Temporal credit assignment in reinforcement learning , 1984 .
[86] A. Hordijk,et al. Linear Programming and Markov Decision Chains , 1979 .
[87] G. Hardy,et al. Ramanujan: Twelve Lectures on Subjects Suggested by His Life and Work , 1978 .
[88] E. Denardo. On Linear Programming in a Markov Decision Problem , 1970 .
[89] J. Williamson. Harmonic Analysis on Semigroups , 1967 .
[90] A. S. Manne. Linear Programming and Sequential Decisions , 1960 .
[91] Arthur L. Samuel,et al. Some Studies in Machine Learning Using the Game of Checkers , 1967, IBM J. Res. Dev..
[92] R. Bellman,et al. On the Theory of Dynamic Programming , 1952, Proceedings of the National Academy of Sciences of the United States of America.
[93] N. Aronszajn. Theory of Reproducing Kernels , 1950 .
[94] Claude E. Shannon,et al. Programming a computer for playing chess , 1950 .