Bayesian Nonparametric Reward Learning From Demonstration
Bernard Michini | Thomas J. Walsh | Ali-akbar Agha-mohammadi | Jonathan P. How
[1] Stochastic Relaxation , 2014, Computer Vision, A Reference Guide.
[2] Chris L. Baker,et al. Action understanding as inverse planning , 2009, Cognition.
[3] Michael I. Jordan,et al. Nonparametric Bayesian Learning of Switching Linear Dynamical Systems , 2008, NIPS.
[4] Doina Precup,et al. Learning Options in Reinforcement Learning , 2002, SARA.
[5] Alexander Zelinsky,et al. Programing by Demonstration: Coping with Suboptimal Teaching Actions , 2003 .
[6] J. Andrew Bagnell,et al. Maximum margin planning , 2006, ICML.
[7] Csaba Szepesvári,et al. Apprenticeship Learning using Inverse Reinforcement Learning and Gradient Methods , 2007, UAI.
[8] Marc Toussaint,et al. Optimization of sequential attractor-based movement for compact behaviour generation , 2007, 2007 7th IEEE-RAS International Conference on Humanoid Robots.
[9] Manuela M. Veloso,et al. Multi-thresholded approach to demonstration selection for interactive robot learning , 2008, 2008 3rd ACM/IEEE International Conference on Human-Robot Interaction (HRI).
[10] Guido Bugmann,et al. Mobile robot programming using natural language , 2002, Robotics Auton. Syst..
[11] Carl E. Rasmussen,et al. Gaussian process dynamic programming , 2009, Neurocomputing.
[12] Illah R. Nourbakhsh,et al. A Preliminary Study of Peer-to-Peer Human-Robot Interaction , 2006, 2006 IEEE International Conference on Systems, Man and Cybernetics.
[13] Alicia P. Wolfe,et al. Identifying useful subgoals in reinforcement learning by local graph partitioning , 2005, ICML.
[14] Sebastian Thrun. Toward a framework for human-robot interaction , 2004 .
[15] Michael L. Littman,et al. Apprenticeship Learning About Multiple Intentions , 2011, ICML.
[16] Yasuharu Koike,et al. PII: S0893-6080(96)00043-3 , 1997 .
[17] Helge J. Ritter,et al. Situated robot learning for multi-modal instruction and imitation of grasping , 2004, Robotics Auton. Syst..
[18] Scott Kuindersma,et al. Robot learning from demonstration by constructing skill trees , 2012, Int. J. Robotics Res..
[19] Eric R. Ziegel,et al. Practical Nonparametric and Semiparametric Bayesian Statistics , 1998, Technometrics.
[20] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell..
[21] Barbara Majecka,et al. Statistical models of pedestrian behaviour in the Forum , 2009 .
[22] Anind K. Dey,et al. Human Behavior Modeling with Maximum Entropy Inverse Optimal Control , 2009, AAAI Spring Symposium: Human Behavior Modeling.
[23] David B. Dunson,et al. Bayesian Data Analysis , 2010 .
[24] G. Roberts,et al. Updating Schemes, Correlation Structure, Blocking and Parameterization for the Gibbs Sampler , 1997 .
[25] Jan Peters,et al. Movement extraction by detecting dynamics switches and repetitions , 2010, NIPS.
[26] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[27] Andrew G. Barto,et al. Automatic Discovery of Subgoals in Reinforcement Learning using Diverse Density , 2001, ICML.
[28] P. Damlen,et al. Gibbs sampling for Bayesian non‐conjugate and hierarchical models by using auxiliary variables , 1999 .
[29] Jean Scholtz,et al. Awareness in human-robot interactions , 2003, SMC'03 Conference Proceedings. 2003 IEEE International Conference on Systems, Man and Cybernetics. Conference Theme - System Security and Assurance (Cat. No.03CH37483).
[30] Pradeep K. Khosla,et al. A Multi-Agent System for Programming Robotic Agents by Human Demonstration , 1998 .
[31] R. Bellman. Dynamic programming, 1957, Science.
[32] Marc Toussaint,et al. Learned graphical models for probabilistic planning provide a new class of movement primitives , 2013, Front. Comput. Neurosci..
[33] Petre Stoica,et al. Decentralized Control , 2018, The Control Systems Handbook.
[34] Christopher D. Wickens,et al. A model for types and levels of human interaction with automation , 2000, IEEE Trans. Syst. Man Cybern. Part A.
[35] Bernhard Schölkopf,et al. Switched Latent Force Models for Movement Segmentation , 2010, NIPS.
[36] Thomas L. Griffiths,et al. The nested chinese restaurant process and bayesian nonparametric inference of topic hierarchies , 2007, JACM.
[37] Lehel Csató,et al. Sparse On-Line Gaussian Processes , 2002, Neural Computation.
[38] Radford M. Neal. Probabilistic Inference Using Markov Chain Monte Carlo Methods , 2011 .
[39] Daniel H. Grollman,et al. Dogged Learning for Robots , 2007, Proceedings 2007 IEEE International Conference on Robotics and Automation.
[40] Andrew G. Barto,et al. Learning to Act Using Real-Time Dynamic Programming , 1995, Artif. Intell..
[41] Claude Sammut,et al. Learning to Fly , 1992, ML.
[42] Christos Dimitrakakis,et al. Bayesian Multitask Inverse Reinforcement Learning , 2011, EWRL.
[43] Manuela M. Veloso,et al. Confidence-based policy learning from demonstration using Gaussian mixture models , 2007, AAMAS '07.
[44] Chrystopher L. Nehaniv,et al. Teaching robots by moulding behavior and scaffolding the environment , 2006, HRI '06.
[45] Stephen P. Boyd,et al. Linear Matrix Inequalities in Systems and Control Theory , 1994 .
[46] Scott Niekum,et al. Learning and generalization of complex tasks from unstructured demonstrations , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[47] Paul E. Rybski,et al. Interactive task training of a mobile robot through human gesture recognition , 1999, Proceedings 1999 IEEE International Conference on Robotics and Automation (Cat. No.99CH36288C).
[48] J. Berger. Statistical Decision Theory and Bayesian Analysis , 1988 .
[49] Dean Pomerleau,et al. Efficient Training of Artificial Neural Networks for Autonomous Navigation , 1991, Neural Computation.
[50] Jonathan P. How,et al. Bayesian Nonparametric Inverse Reinforcement Learning , 2012, ECML/PKDD.
[51] Yee Whye Teh,et al. Collapsed Variational Inference for HDP , 2007, NIPS.
[52] E. Yaz. Linear Matrix Inequalities In System And Control Theory , 1998, Proceedings of the IEEE.
[53] Jonathan P. How,et al. Actuator Constrained Trajectory Generation and Control for Variable-Pitch Quadrotors , 2012 .
[54] M. Escobar,et al. Bayesian Density Estimation and Inference Using Mixtures , 1995 .
[55] Thomas J. Walsh,et al. Generalizing Apprenticeship Learning across Hypothesis Classes , 2010, ICML.
[56] Michael I. Jordan,et al. Tree-Structured Stick Breaking for Hierarchical Data , 2010, NIPS.
[57] Pamela J. Hinds,et al. Autonomy and Common Ground in Human-Robot Interaction: A Field Study , 2007, IEEE Intelligent Systems.
[58] Stefan Schaal,et al. Robot Learning From Demonstration , 1997, ICML.
[59] Manuel Lopes,et al. Active Learning for Reward Estimation in Inverse Reinforcement Learning , 2009, ECML/PKDD.
[60] Jun Morimoto,et al. Learning from demonstration and adaptation of biped locomotion , 2004, Robotics Auton. Syst..
[61] Richard Fikes,et al. STRIPS: A New Approach to the Application of Theorem Proving to Problem Solving , 1971, IJCAI.
[62] Eyal Amir,et al. Bayesian Inverse Reinforcement Learning , 2007, IJCAI.
[63] B. Bethke,et al. Real-time indoor autonomous vehicle test environment , 2008, IEEE Control Systems.
[64] Leslie Pack Kaelbling,et al. Making Reinforcement Learning Work on Real Robots , 2002 .
[65] J. Pitman. Combinatorial Stochastic Processes , 2006 .
[66] Erik B. Sudderth. Graphical models for visual object recognition and tracking , 2006 .
[67] José María Valls,et al. Correcting and improving imitation models of humans for Robosoccer agents , 2005, 2005 IEEE Congress on Evolutionary Computation.
[68] Nathan Delson,et al. Robot programming by human demonstration: adaptation and inconsistency in constrained motion , 1996, Proceedings of IEEE International Conference on Robotics and Automation.
[69] Pieter Abbeel,et al. Apprenticeship learning for helicopter control , 2009, CACM.
[70] Neil D. Lawrence,et al. Fast Sparse Gaussian Process Methods: The Informative Vector Machine , 2002, NIPS.
[71] Masayuki Inaba,et al. Learning by watching: extracting reusable task knowledge from visual observation of human performance , 1994, IEEE Trans. Robotics Autom..
[72] Michael I. Jordan,et al. An Introduction to Variational Methods for Graphical Models , 1999, Machine Learning.
[73] Nando de Freitas,et al. An Introduction to MCMC for Machine Learning , 2004, Machine Learning.
[74] Emilio Frazzoli,et al. Steady-state cornering equilibria and stabilisation for a vehicle during extreme operating conditions , 2010 .
[75] Anind K. Dey,et al. Maximum Entropy Inverse Reinforcement Learning , 2008, AAAI.
[76] Gordon Cheng,et al. Humanoid robot learning and game playing using PC-based vision , 2002, IEEE/RSJ International Conference on Intelligent Robots and Systems.
[77] W. Cleveland,et al. Smoothing by Local Regression: Principles and Methods , 1996 .
[78] Carl E. Rasmussen,et al. A Unifying View of Sparse Approximate Gaussian Process Regression , 2005, J. Mach. Learn. Res..
[79] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[80] Michael A. Goodrich,et al. Seven principles of efficient human robot interaction , 2003, SMC'03 Conference Proceedings. 2003 IEEE International Conference on Systems, Man and Cybernetics. Conference Theme - System Security and Assurance (Cat. No.03CH37483).
[81] Andrew Y. Ng, et al. Algorithms for Inverse Reinforcement Learning, 2000, ICML.
[82] David M. Bradley,et al. Boosting Structured Prediction for Imitation Learning , 2006, NIPS.
[83] Pieter Abbeel,et al. Apprenticeship learning via inverse reinforcement learning , 2004, ICML.
[84] Shie Mannor,et al. Q-Cut - Dynamic Discovery of Sub-goals in Reinforcement Learning , 2002, ECML.
[85] Ignazio Infantino,et al. A posture sequence learning system for an anthropomorphic robotic hand , 2004, Robotics Auton. Syst..
[86] Daniel H. Grollman,et al. Sparse incremental learning for interactive robot control policy estimation , 2008, 2008 IEEE International Conference on Robotics and Automation.
[87] Biao Huang,et al. System Identification , 2000, Control Theory for Physicists.
[88] Aude Billard,et al. Incremental learning of gestures by imitation in a humanoid robot , 2007, 2007 2nd ACM/IEEE International Conference on Human-Robot Interaction (HRI).
[89] Andrew G. Barto,et al. Learning Skills in Reinforcement Learning Using Relative Novelty , 2005, SARA.
[90] Donald Geman,et al. Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images , 1984, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[91] Radford M. Neal. Markov Chain Sampling Methods for Dirichlet Process Mixture Models , 2000 .
[92] Thomas J. Walsh,et al. Teaching and executing verb phrases , 2011, 2011 IEEE International Conference on Development and Learning (ICDL).
[93] Thomas Stibor,et al. Efficient Collapsed Gibbs Sampling for Latent Dirichlet Allocation , 2010, ACML.
[94] Jun Nakanishi,et al. Learning rhythmic movements by demonstration using nonlinear oscillators , 2002, IEEE/RSJ International Conference on Intelligent Robots and Systems.
[95] Max Welling,et al. Fast collapsed gibbs sampling for latent dirichlet allocation , 2008, KDD.
[96] Manuela M. Veloso,et al. Teaching sequential tasks with repetition through demonstration , 2008, AAMAS.
[97] Robert E. Schapire,et al. A Game-Theoretic Approach to Apprenticeship Learning , 2007, NIPS.
[98] Tetsunari Inamura, Masayuki Inaba, Hirochika Inoue. Acquisition of Probabilistic Behavior Decision Model based on the Interactive Teaching Method, 2001.
[99] Pieter Abbeel,et al. Apprenticeship learning and reinforcement learning with application to robotic control , 2008 .
[100] M. Opper. Sparse Online Gaussian Processes , 2008 .
[101] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[102] Carl E. Rasmussen,et al. Gaussian processes for machine learning , 2005, Adaptive computation and machine learning.
[103] Ben Tse,et al. Autonomous Inverted Helicopter Flight via Reinforcement Learning , 2004, ISER.
[104] Kee-Eung Kim,et al. Nonparametric Bayesian Inverse Reinforcement Learning for Multiple Reward Functions , 2012, NIPS.
[105] Michael I. Jordan,et al. Variational methods for the Dirichlet process , 2004, ICML.
[106] Pieter Abbeel,et al. Hierarchical Apprenticeship Learning with Application to Quadruped Locomotion , 2007, NIPS.
[107] Roderic A. Grupen,et al. A model of shared grasp affordances from demonstration , 2007, 2007 7th IEEE-RAS International Conference on Humanoid Robots.
[108] Dana H. Ballard,et al. Recognizing teleoperated manipulations , 1993, [1993] Proceedings IEEE International Conference on Robotics and Automation.
[109] Jonathan P. How,et al. Scalable reward learning from demonstration , 2013, 2013 IEEE International Conference on Robotics and Automation.
[110] Avinash C. Kak,et al. Automatic learning of assembly tasks using a DataGlove system , 1995, Proceedings 1995 IEEE/RSJ International Conference on Intelligent Robots and Systems. Human Robot Interaction and Cooperative Robots.
[111] Alexander J. Smola,et al. Learning with Kernels: support vector machines, regularization, optimization, and beyond , 2001, Adaptive computation and machine learning series.
[112] Brett Browning,et al. A survey of robot learning from demonstration , 2009, Robotics Auton. Syst..
[113] Sara B. Kiesler,et al. Fostering common ground in human-robot interaction , 2005, ROMAN 2005. IEEE International Workshop on Robot and Human Interactive Communication, 2005..