Learning to search: structured prediction techniques for imitation learning
[1] A. Wightman,et al. Mathematical Physics. , 1930, Nature.
[2] R. E. Kalman,et al. When Is a Linear Control System Optimal , 1964 .
[3] Leslie G. Valiant,et al. A theory of the learnable , 1984, STOC '84.
[4] Naum Zuselevich Shor,et al. Minimization Methods for Non-Differentiable Functions , 1985, Springer Series in Computational Mathematics.
[5] Dean Pomerleau,et al. ALVINN, an autonomous land vehicle in a neural network , 1989 .
[6] Manfred K. Warmuth,et al. The weighted majority algorithm , 1989, 30th Annual Symposium on Foundations of Computer Science.
[7] B. Anderson,et al. Optimal control: linear quadratic methods , 1990 .
[8] Steven Dubowsky,et al. On computing the global time-optimal motions of robotic manipulators in the presence of obstacles , 1991, IEEE Trans. Robotics Autom..
[9] Stefan Schaal,et al. Open loop stable control strategies for robot juggling , 1993, [1993] Proceedings IEEE International Conference on Robotics and Automation.
[10] Oussama Khatib,et al. Elastic bands: connecting path planning and control , 1993, [1993] Proceedings IEEE International Conference on Robotics and Automation.
[11] Philip M. Long,et al. Worst-Case Quadratic Loss Bounds for On-Line Prediction of Linear Functions by Gradient Descent , 1993 .
[12] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[13] Claude Sammut,et al. A Framework for Behavioural Cloning , 1995, Machine Intelligence 15.
[14] Yoav Freund,et al. A decision-theoretic generalization of on-line learning and an application to boosting , 1995, EuroCOLT.
[15] B. Faverjon,et al. Probabilistic Roadmaps for Path Planning in High-Dimensional Configuration Spaces , 1996 .
[16] Yoav Freund,et al. A decision-theoretic generalization of on-line learning and an application to boosting , 1997, EuroCOLT.
[17] Manfred K. Warmuth,et al. Exponentiated Gradient Versus Gradient Descent for Linear Predictors , 1997, Inf. Comput..
[18] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[19] Shun-ichi Amari,et al. Natural Gradient Works Efficiently in Learning , 1998, Neural Computation.
[20] E. Yaz. Linear Matrix Inequalities In System And Control Theory , 1998, Proceedings of the IEEE.
[21] Lydia E. Kavraki,et al. Probabilistic Roadmaps for Robot Path Planning , 1998 .
[22] Yong K. Hwang,et al. SANDROS: a dynamic graph search algorithm for motion planning , 1998, IEEE Trans. Robotics Autom..
[23] Y. Freund,et al. Adaptive game playing using multiplicative weights , 1999 .
[24] Geoffrey J. Gordon,et al. Approximate solutions to markov decision processes , 1999 .
[25] Andrew McCallum,et al. Using Maximum Entropy for Text Classification , 1999 .
[26] Vladimir J. Lumelsky,et al. Biped robot locomotion in scenes with unknown obstacles , 1999, Proceedings 1999 IEEE International Conference on Robotics and Automation (Cat. No.99CH36288C).
[27] Steven M. LaValle,et al. RRT-connect: An efficient approach to single-query path planning , 2000, Proceedings 2000 ICRA. Millennium Conference. IEEE International Conference on Robotics and Automation. Symposia Proceedings (Cat. No.00CH37065).
[28] Shun-ichi Amari,et al. Methods of information geometry , 2000 .
[29] Vladimir N. Vapnik,et al. The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.
[30] Andrew Y. Ng,et al. Algorithms for Inverse Reinforcement Learning , 2000, ICML.
[31] Peter L. Bartlett,et al. Functional Gradient Techniques for Combining Hypotheses , 2000 .
[32] J. Friedman. Greedy function approximation: A gradient boosting machine. , 2001 .
[33] Bernhard Schölkopf,et al. A Generalized Representer Theorem , 2001, COLT/EuroCOLT.
[34] D. Bertsekas,et al. Convergence Rate of Incremental Subgradient Algorithms , 2000 .
[35] Mark Herbster,et al. Tracking the Best Linear Predictor , 2001, J. Mach. Learn. Res..
[36] Yoram Baram,et al. Manifold Stochastic Dynamics for Bayesian Learning , 1999, Neural Computation.
[37] Alexander J. Smola,et al. Learning with Kernels: support vector machines, regularization, optimization, and beyond , 2001, Adaptive computation and machine learning series.
[38] B. Moor,et al. Mixed integer programming for multi-vehicle path planning , 2001, 2001 European Control Conference (ECC).
[39] Sham M. Kakade,et al. A Natural Policy Gradient , 2001, NIPS.
[40] Tong Zhang,et al. Covering Number Bounds of Certain Regularized Linear Function Classes , 2002, J. Mach. Learn. Res..
[41] Martin J. Wainwright,et al. Stochastic processes on graphs with cycles: geometric and variational approaches , 2002 .
[42] Ben Taskar,et al. Max-Margin Markov Networks , 2003, NIPS.
[43] Henrik I. Christensen,et al. Automatic grasp planning using shape primitives , 2003, 2003 IEEE International Conference on Robotics and Automation (Cat. No.03CH37422).
[44] J. Chestnutt,et al. Planning Biped Navigation Strategies in Complex Environments , 2003 .
[45] Jeff G. Schneider,et al. Covariant policy search , 2003, IJCAI 2003.
[46] O. Brock,et al. Elastic Strips: A Framework for Motion Generation in Human Environments , 2002, Int. J. Robotics Res..
[47] Martin Zinkevich,et al. Online Convex Programming and Generalized Infinitesimal Gradient Ascent , 2003, ICML.
[48] Bernhard Schölkopf,et al. A tutorial on support vector regression , 2004, Stat. Comput..
[49] Alexander J. Smola,et al. Online learning with kernels , 2001, IEEE Transactions on Signal Processing.
[50] Manfred K. Warmuth,et al. Relative Loss Bounds for Multidimensional Regression Problems , 1997, Machine Learning.
[51] Brian Roark,et al. Incremental Parsing with the Perceptron Algorithm , 2004, ACL.
[52] Joel A. Tropp,et al. Greed is good: algorithmic results for sparse approximation , 2004, IEEE Transactions on Information Theory.
[53] Claudio Gentile,et al. On the generalization ability of on-line learning algorithms , 2001, IEEE Transactions on Information Theory.
[54] A. Moore,et al. Learning decisions: robustness, uncertainty, and approximation , 2004 .
[55] Ben Taskar,et al. Exponentiated Gradient Algorithms for Large-margin Structured Classification , 2004, NIPS.
[56] Gunnar Rätsch,et al. Matrix Exponentiated Gradient Updates for On-line Learning and Bregman Projection , 2004, J. Mach. Learn. Res..
[57] Raffaello D'Andrea,et al. Iterative MILP methods for vehicle-control problems , 2005, IEEE Transactions on Robotics.
[58] Ji Zhu,et al. Boosting as a Regularized Path to a Maximum Margin Classifier , 2004, J. Mach. Learn. Res..
[59] Pieter Abbeel,et al. Apprenticeship learning via inverse reinforcement learning , 2004, ICML.
[60] Takeo Kanade,et al. Footstep Planning for the Honda ASIMO Humanoid , 2005, Proceedings of the 2005 IEEE International Conference on Robotics and Automation.
[61] Ben Taskar,et al. Discriminative learning of Markov random fields for segmentation of 3D scan data , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).
[62] Ben Taskar,et al. Learning structured prediction models: a large margin approach , 2005, ICML.
[63] Yann LeCun,et al. Off-Road Obstacle Avoidance through End-to-End Learning , 2005, NIPS.
[64] Thomas Hofmann,et al. Large Margin Methods for Structured and Interdependent Output Variables , 2005, J. Mach. Learn. Res..
[65] Ben Taskar,et al. Structured Prediction via the Extragradient Method , 2005, NIPS.
[66] Pieter Abbeel,et al. An Application of Reinforcement Learning to Aerobatic Helicopter Flight , 2006, NIPS.
[67] J. Andrew Bagnell,et al. Maximum margin planning , 2006, ICML.
[68] Brett Browning,et al. Learning to Predict Driver Route and Destination Intent , 2006, 2006 IEEE Intelligent Transportation Systems Conference.
[69] Thorsten Joachims,et al. Training linear SVMs in linear time , 2006, KDD '06.
[70] Chaitanya Swamy,et al. An approximation scheme for stochastic linear programming and its application to stochastic integer programs , 2006, JACM.
[71] Stephen P. Boyd,et al. Convex Optimization , 2004, Algorithms and Theory of Computation Handbook.
[72] Fu Jie Huang,et al. A Tutorial on Energy-Based Learning , 2006 .
[73] C.S. Ma,et al. MILP optimal path planning for real-time applications , 2006, 2006 American Control Conference.
[74] Yee Whye Teh,et al. A Fast Learning Algorithm for Deep Belief Nets , 2006, Neural Computation.
[75] Mark H. Overmars,et al. Creating High-quality Roadmaps for Motion Planning in Virtual Environments , 2006, 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[76] Gábor Lugosi,et al. Prediction, learning, and games , 2006 .
[77] David M. Bradley,et al. Boosting Structured Prediction for Imitation Learning , 2006, NIPS.
[78] Richard Szeliski,et al. A Comparative Study of Energy Minimization Methods for Markov Random Fields , 2006, ECCV.
[79] Nathan Ratliff,et al. (Online) Subgradient Methods for Structured Prediction , 2007 .
[80] Léon Bottou,et al. The Tradeoffs of Large Scale Learning , 2007, NIPS.
[81] T. Poggio,et al. Regularized Least-Squares Classification , 2007 .
[82] Pieter Abbeel,et al. Hierarchical Apprenticeship Learning with Application to Quadruped Locomotion , 2007, NIPS.
[83] Alexander J. Smola,et al. Bundle Methods for Machine Learning , 2007, NIPS.
[84] Siddhartha S. Srinivasa,et al. Imitation learning for locomotion and manipulation , 2007, 2007 7th IEEE-RAS International Conference on Humanoid Robots.
[85] Csaba Szepesvári,et al. Apprenticeship Learning using Inverse Reinforcement Learning and Gradient Methods , 2007, UAI.
[86] Elad Hazan,et al. Logarithmic regret algorithms for online convex optimization , 2006, Machine Learning.
[87] J. Andrew Bagnell,et al. Kernel Conjugate Gradient for Fast Kernel Machines , 2007, IJCAI.
[88] Eyal Amir,et al. Bayesian Inverse Reinforcement Learning , 2007, IJCAI.
[89] Anind K. Dey,et al. Maximum Entropy Inverse Reinforcement Learning , 2008, AAAI.
[90] Anind K. Dey,et al. Navigate like a cabbie: probabilistic reasoning from observed context-aware behavior , 2008, UbiComp.
[91] David Silver,et al. High Performance Outdoor Navigation from Overhead Data using Imitation Learning , 2008, Robotics: Science and Systems.
[92] Steven L. Waslander,et al. Tunnel-MILP: Path Planning with Sequential Convex Polytopes , 2008, AIAA Guidance, Navigation and Control Conference and Exhibit.
[93] Nathan Srebro,et al. SVM optimization: inverse dependence on training set size , 2008, ICML '08.
[94] Martial Hebert,et al. Directional Associative Markov Network for 3-D Point Cloud Classification , 2008 .
[95] Elisa Ricci,et al. Large Margin Methods for Structured Output Prediction , 2008, Computational Intelligence Paradigms.
[96] John Krumm. A Markov Model for Driver Turn Prediction (Number 2008-01-0195) , 2008 .
[97] Pieter Abbeel,et al. Learning for control from multiple demonstrations , 2008, ICML '08.
[98] Siddhartha S. Srinivasa,et al. CHOMP: Gradient optimization techniques for efficient motion planning , 2009, 2009 IEEE International Conference on Robotics and Automation.
[99] Siddhartha S. Srinivasa,et al. Inverse Optimal Heuristic Control for Imitation Learning , 2009, AISTATS.
[100] David Silver,et al. Learning to search: Functional gradient techniques for imitation learning , 2009, Auton. Robots.
[101] Martial Hebert,et al. Contextual classification with functional Max-Margin Markov Networks , 2009, CVPR.
[102] T. Banchoff,et al. Differential Geometry of Curves and Surfaces , 2010 .
[103] Radford M. Neal. Probabilistic Inference Using Markov Chain Monte Carlo Methods , 2011 .
[104] Yoram Singer,et al. Pegasos: primal estimated sub-gradient solver for SVM , 2011, Math. Program..