Learning to Make Safe Real-Time Decisions Under Uncertainty for Autonomous Robots

Robots are increasingly expected to move beyond the controlled environments of laboratories and factories and to act autonomously in real-world workplaces and public spaces. Autonomous robots navigating the ...
