Apprenticeship learning and reinforcement learning with application to robotic control
[1] R. E. Kalman. A New Approach to Linear Filtering and Prediction Problems, 1960, Journal of Basic Engineering.
[2] A. S. Manne. Linear Programming and Sequential Decisions, 1960.
[3] R. Tyrrell Rockafellar, et al. Convex Analysis, 1970, Princeton Landmarks in Mathematics and Physics.
[4] S. B. Needleman, et al. A general method applicable to the search for similarities in the amino acid sequence of two proteins, 1970, Journal of Molecular Biology.
[5] D. H. Jacobson, et al. Differential Dynamic Programming, 1970, American Elsevier.
[6] J. K. Satia, et al. Markovian Decision Processes with Uncertain Transition Probabilities, 1973, Oper. Res.
[7] Arthur Gelb, et al. Applied Optimal Estimation, 1974.
[8] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Vol. II, 1976.
[9] A. P. Dempster, et al. Maximum likelihood from incomplete data via the EM algorithm (with discussion), 1977, Journal of the Royal Statistical Society, Series B.
[10] S. Chiba, et al. Dynamic programming algorithm optimization for spoken word recognition, 1978.
[11] E. J. Lefferts, et al. Kalman Filtering for Spacecraft Attitude Estimation, 1982.
[12] N. Hogan. An organizing principle for a class of voluntary movements, 1984, The Journal of Neuroscience.
[13] Lennart Ljung, et al. System Identification: Theory for the User, 1987.
[14] Dean Pomerleau. ALVINN: An Autonomous Land Vehicle in a Neural Network, 1989, NIPS.
[15] Y. Uno, et al. Formation and control of optimal trajectory in human multijoint arm movement: minimum torque-change model, 1988.
[16] Christopher G. Atkeson, et al. Model-Based Control of a Robot Manipulator, 1988.
[17] Judea Pearl, et al. Probabilistic reasoning in intelligent systems: networks of plausible inference, 1991, Morgan Kaufmann Series in Representation and Reasoning.
[18] Lawrence R. Rabiner, et al. A tutorial on hidden Markov models and selected applications in speech recognition, 1989, Proc. IEEE.
[19] B. Anderson, et al. Optimal control: linear quadratic methods, 1990.
[20] R. Durrett. Probability: Theory and Examples, 1993.
[21] Simon Newman. Basic Helicopter Aerodynamics, 1990.
[22] David Williams, et al. Probability with Martingales, 1991, Cambridge Mathematical Textbooks.
[23] Thomas M. Cover, et al. Elements of Information Theory, 2005.
[24] Mark B. Tischler, et al. Frequency-Response Method for Rotorcraft System Identification: Flight Applications to BO 105 Coupled Rotor/Fuselage Dynamics, 1992.
[25] L. Jones. A Simple Lemma on Greedy Approximation in Hilbert Space and Convergence Rates for Projection Pursuit Regression and Neural Network Training, 1992.
[26] Frank L. Lewis, et al. Aircraft Control and Simulation, 1992.
[27] T. D. Gillespie. Fundamentals of Vehicle Dynamics, 1992.
[28] Chelsea C. White, et al. Markov Decision Processes with Imprecise Transition Probabilities, 1994, Oper. Res.
[29] Masayuki Inaba, et al. Learning by watching: extracting reusable task knowledge from visual observation of human performance, 1994, IEEE Trans. Robotics Autom.
[30] Hermann Ney, et al. On structuring probabilistic dependences in stochastic language modelling, 1994, Comput. Speech Lang.
[31] Stefan Schaal, et al. Robot learning by nonparametric regression, 1994, Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS '94).
[32] Gillian M. Hayes, et al. A Robot Controller Using Learning by Imitation, 1994.
[33] Alan J. Laub, et al. The LMI control toolbox, 1994, Proceedings of the 33rd IEEE Conference on Decision and Control.
[34] Richard S. Sutton, et al. TD Models: Modeling the World at a Mixture of Time Scales, 1995, ICML.
[35] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.
[36] Andrew W. Moore, et al. Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.
[37] Craig Boutilier, et al. Context-Specific Independence in Bayesian Networks, 1996, UAI.
[38] J. Doyle, et al. Robust and optimal control, 1995, Proceedings of the 35th IEEE Conference on Decision and Control.
[39] P. Spreij. Probability and Measure, 1996.
[40] Stefan Schaal, et al. Robot Learning From Demonstration, 1997, ICML.
[41] Doina Precup, et al. Theoretical Results on Reinforcement Learning with Temporally Abstract Options, 1998, ECML.
[42] Leslie Pack Kaelbling, et al. Planning and Acting in Partially Observable Stochastic Domains, 1998, Artif. Intell.
[43] Vladimir Vapnik. Statistical learning theory, 1998.
[44] Preben Alstrøm, et al. Learning to Drive a Bicycle Using Reinforcement Learning and Shaping, 1998, ICML.
[45] Geoffrey E. Hinton, et al. A View of the EM Algorithm that Justifies Incremental, Sparse, and Other Variants, 1998, Learning in Graphical Models.
[46] C. Sammut, et al. Learning to Fly, 1992, ICML.
[47] Takeo Kanade, et al. System identification of small-size unmanned helicopter dynamics, 1999.
[48] Yishay Mansour, et al. Approximate Planning in Large POMDPs via Reusable Trajectories, 1999, NIPS.
[49] Kevin L. Moore, et al. Iterative Learning Control: An Expository Overview, 1999.
[50] Andrew Y. Ng, et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping, 1999, ICML.
[51] Michael Kearns, et al. Efficient Reinforcement Learning in Factored MDPs, 1999, IJCAI.
[52] J. Gordon Leishman. Principles of Helicopter Aerodynamics, 2000.
[53] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition, 1999, J. Artif. Intell. Res.
[54] Leslie Pack Kaelbling, et al. Practical Reinforcement Learning in Continuous Spaces, 2000, ICML.
[55] Andrew Y. Ng, et al. Algorithms for Inverse Reinforcement Learning, 2000, ICML.
[56] Jeff G. Schneider, et al. Autonomous helicopter control using reinforcement learning policy search methods, 2001, Proceedings of the 2001 IEEE International Conference on Robotics and Automation (ICRA).
[57] Michael I. Jordan, et al. On Discriminative vs. Generative Classifiers: A comparison of logistic regression and naive Bayes, 2001, NIPS.
[58] Jun Morimoto, et al. Acquisition of stand-up behavior by a real robot using hierarchical reinforcement learning, 2000, Robotics Auton. Syst.
[59] Eric Feron, et al. Control Logic for Automated Aerobatic Flight of a Miniature Helicopter, 2002.
[60] Bernard Mettler, et al. Flight test and simulation results for an autonomous aerobatic helicopter, 2002, Proceedings of the 21st Digital Avionics Systems Conference.
[61] R. Amit, et al. Learning movement sequences from demonstration, 2002, Proceedings of the 2nd International Conference on Development and Learning (ICDL).
[62] Ronen I. Brafman, et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning, 2001, J. Mach. Learn. Res.
[63] Jun Morimoto, et al. Minimax Differential Dynamic Programming: An Application to Robust Biped Walking, 2002, NIPS.
[64] S. Shankar Sastry, et al. Autonomous Helicopter Flight via Reinforcement Learning, 2003, NIPS.
[65] Gaurav S. Sukhatme, et al. Visually guided landing of an unmanned aerial vehicle, 2003, IEEE Trans. Robotics Autom.
[66] John Langford, et al. Exploration in Metric State Spaces, 2003, ICML.
[67] Peter I. Corke, et al. Low-cost flight control system for a small autonomous helicopter, 2003, Proceedings of the 2003 IEEE International Conference on Robotics and Automation (ICRA).
[68] M. Kawato, et al. Formation and control of optimal trajectory in human multijoint arm movement, 1989, Biological Cybernetics.
[69] Radford M. Neal, et al. Multiple Alignment of Continuous Time Series, 2004, NIPS.
[70] Pieter Abbeel, et al. Learning first-order Markov models for control, 2004, NIPS.
[71] Peter Stone, et al. Policy gradient reinforcement learning for fast quadrupedal locomotion, 2004, Proceedings of the 2004 IEEE International Conference on Robotics and Automation (ICRA).
[72] Andrew W. Moore, et al. Locally Weighted Learning for Control, 1997, Artificial Intelligence Review.
[73] Ben Tse, et al. Autonomous Inverted Helicopter Flight via Reinforcement Learning, 2004, ISER.
[74] Michael I. Jordan, et al. Mixed Memory Markov Models: Decomposing Complex Stochastic Processes as Mixtures of Simpler Ones, 1999, Machine Learning.
[75] Sham M. Kakade, et al. Online Bounds for Bayesian Algorithms, 2004, NIPS.
[76] Eric Feron, et al. Human-Inspired Control Logic for Automated Maneuvering of Miniature Helicopter, 2004.
[77] Michael Kearns, et al. Near-Optimal Reinforcement Learning in Polynomial Time, 2002, Machine Learning.
[78] Pieter Abbeel, et al. Apprenticeship learning via inverse reinforcement learning, 2004, ICML.
[79] Pieter Abbeel, et al. Exploration and apprenticeship learning in reinforcement learning, 2005, ICML.
[80] Laurent El Ghaoui, et al. Robust Solutions to Markov Decision Problems with Uncertain Transition Matrices, 2005.
[81] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, MIT Press.
[82] Pieter Abbeel, et al. Learning vehicular dynamics, with application to modeling helicopters, 2005, NIPS.
[83] G. Dullerud, et al. A Course in Robust Control Theory: A Convex Approach, 2005.
[84] J. Andrew Bagnell, et al. Maximum margin planning, 2006, ICML.
[85] Pieter Abbeel, et al. Using inaccurate models in reinforcement learning, 2006, ICML.
[86] David M. Bradley, et al. Boosting Structured Prediction for Imitation Learning, 2006, NIPS.
[87] Robert E. Schapire, et al. A Game-Theoretic Approach to Apprenticeship Learning, 2007, NIPS.
[88] Aude Billard, et al. On Learning, Representing, and Generalizing a Task in a Humanoid Robot, 2007, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).
[89] Pieter Abbeel, et al. Hierarchical Apprenticeship Learning with Application to Quadruped Locomotion, 2007, NIPS.
[90] J. Listgarten. Analysis of sibling time series data: Alignment and difference detection, 2007.
[91] Csaba Szepesvári, et al. Apprenticeship Learning using Inverse Reinforcement Learning and Gradient Methods, 2007, UAI.
[92] Eyal Amir, et al. Bayesian Inverse Reinforcement Learning, 2007, IJCAI.
[93] Sebastian Thrun, et al. Apprenticeship learning for motion planning with application to parking lot navigation, 2008, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[94] Andrew Y. Ng, et al. A control architecture for quadruped locomotion over rough terrain, 2008, IEEE International Conference on Robotics and Automation (ICRA).
[96] Sebastian Thrun, et al. Path Planning for Autonomous Driving in Unknown Environments, 2008, ISER.
[97] Pieter Abbeel, et al. Learning for control from multiple demonstrations, 2008, ICML.
[98] Sebastian Thrun, et al. Junior: The Stanford entry in the Urban Challenge, 2008, J. Field Robotics.