Cooperation for Scalable Supervision of Autonomy in Mixed Traffic

Improvements in autonomy offer the potential for positive outcomes in a number of domains, yet guaranteeing the safe deployment of autonomous systems is difficult. This work investigates how humans can intelligently supervise agents to achieve some level of safety even when performance guarantees are elusive. The motivating research question is: in safety-critical settings, can we avoid the need to have one human supervise one machine at all times? The paper formalizes this ‘scaling supervision’ problem and investigates its application to the safety-critical context of autonomous vehicles (AVs) merging into traffic. It proposes a conservative, reachability-based method to reduce the burden on the AVs’ human supervisors, which makes it possible to establish high-confidence upper bounds on the supervision requirements in this setting. Order statistics and traffic simulations with deep reinforcement learning show, analytically and numerically, that teaming of AVs enables supervision time that grows sublinearly with AV adoption. A key takeaway is that, despite the present imperfections of AVs, supervision becomes more tractable as AVs are deployed en masse. While this work focuses on AVs, the scalable supervision framework is relevant to a broader array of autonomous control challenges.
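To give a feel for how order statistics can produce a sublinear supervision requirement, the following is a minimal toy sketch and not the paper's model. It assumes, purely for illustration, that a team of n AVs is supervised jointly, that each AV's required supervision time is an independent exponential random variable, and that the team's supervision time is the longest of these (the n-th order statistic). Under those assumptions the expected team time is the harmonic number H_n divided by the rate, which grows roughly like ln(n) rather than n. The function names and the exponential assumption are hypothetical.

```python
import random


def expected_max_exponential(n, rate=1.0):
    """Exact E[max of n i.i.d. Exp(rate) variables] = H_n / rate.

    Toy assumption: one supervisor watches a team of n AVs at once, so the
    team's supervision time is the maximum of the individual times.
    """
    return sum(1.0 / k for k in range(1, n + 1)) / rate


def simulated_max_exponential(n, rate=1.0, trials=20000, seed=0):
    """Monte Carlo estimate of the same quantity, as a sanity check."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += max(rng.expovariate(rate) for _ in range(n))
    return total / trials


if __name__ == "__main__":
    for n in (1, 2, 4, 8, 16):
        team = expected_max_exponential(n)
        one_to_one = float(n)  # n supervisors, each with mean time 1 / rate
        print(f"n={n:2d}  team time={team:.3f}  one-to-one total={one_to_one:.1f}")
```

In this toy setting, doubling the team size adds only a roughly constant increment to the expected supervision time, whereas one-to-one supervision scales linearly with the number of AVs. The paper's actual bounds come from its reachability-based method and traffic simulations, which this sketch does not reproduce.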
