Planning for Autonomous Cars that Leverage Effects on Human Actions

Traditionally, autonomous cars make predictions about other drivers’ future trajectories, and plan to stay out of their way. This tends to result in defensive and opaque behaviors. Our key insight is that an autonomous car’s actions will actually affect what other cars will do in response, whether the car is aware of it or not. Our thesis is that we can leverage these responses to plan more efficient and communicative behaviors. We model the interaction between an autonomous car and a human driver as a dynamical system, in which the robot’s actions have immediate consequences on the state of the car, but also on human actions. We model these consequences by approximating the human as an optimal planner, with a reward function that we acquire through Inverse Reinforcement Learning. When the robot plans with this reward function in this dynamical system, it comes up with actions that purposefully change human state: it merges in front of a human to get them to slow down or to reach its own goal faster; it blocks two lanes to get them to switch to a third lane; or it backs up slightly at an intersection to get them to proceed first. Such behaviors arise from the optimization, without relying on hand-coded signaling strategies and without ever explicitly modeling communication. Our user study results suggest that the robot is indeed capable of eliciting desired changes in human state by planning using this dynamical system.
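The core idea above, planning through the human's response rather than around a fixed prediction, can be sketched as a nested optimization. The following is a minimal one-step sketch under illustrative assumptions: a single scalar "speed" action per agent and hand-written quadratic rewards that stand in for the reward function the paper learns via Inverse Reinforcement Learning (in the paper, both agents plan over full trajectories). The scenario, reward weights, and speed grid are all hypothetical.

```python
# Minimal sketch: the robot plans over its own action *and* the human
# best response that action induces. The rewards below are hypothetical
# stand-ins for the IRL-learned reward in the paper.

SPEEDS = [s / 10 for s in range(21)]  # candidate speeds, 0.0 .. 2.0

def human_reward(u_h, u_r):
    """Human prefers a nominal speed of 1.0, but is penalized for
    closing in on a slower robot that has merged in front."""
    keep_speed = -(u_h - 1.0) ** 2
    safety = -4.0 * max(0.0, u_h - u_r) ** 2
    return keep_speed + safety

def human_response(u_r):
    """Approximate the human as an optimal planner: grid-search the
    best response to the robot's action u_r."""
    return max(SPEEDS, key=lambda u_h: human_reward(u_h, u_r))

def robot_reward(u_r, u_h):
    """The robot benefits when the human slows down (an easier merge),
    at a quadratic cost for deviating from its own nominal speed."""
    return -u_h - (u_r - 1.0) ** 2

def plan():
    """Optimize the robot's action through the human response model."""
    return max(SPEEDS, key=lambda u_r: robot_reward(u_r, human_response(u_r)))

u_r = plan()
# The robot deliberately drives below its nominal speed, which induces
# the human to slow down from their nominal speed as well.
print(u_r, human_response(u_r))
```

For contrast: if the human speed were treated as a fixed prediction (always the nominal 1.0), the same optimization would just pick the robot's nominal speed, since slowing down would carry cost with no effect. It is optimizing *through* `human_response` that produces the behavior the abstract describes, where the robot's action purposefully changes the human's state.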