Benchmark for Out-of-Distribution Detection in Deep Reinforcement Learning

Reinforcement Learning (RL) based solutions are being adopted across a variety of domains, including robotics, health care, and industrial automation. Most attention is given to cases where these solutions work well, yet, like most machine learning models, RL policies can fail when presented with out-of-distribution (OOD) inputs. OOD detection for RL is not well covered in the literature, and there is a lack of benchmarks for this task. In this work we propose a benchmark for evaluating OOD detection methods in an RL setting, constructed by modifying the physical parameters of non-visual standard environments or corrupting the state observations of visual environments. We discuss ways to generate custom RL environments that produce OOD data, and we evaluate three uncertainty-based methods on the OOD detection task. Our results show that ensemble methods achieve the best OOD detection performance, with a lower standard deviation across multiple environments.
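
To make the two ways of generating OOD data concrete, the sketch below (not the benchmark's actual code) shows one possible approach: scaling a physical parameter of a non-visual Gym environment and corrupting visual observations with a wrapper. The attribute names (`length`, `masspole`, `polemass_length`) assume the classic Gym `CartPoleEnv` implementation, and the scale and noise values are illustrative only.

```python
import numpy as np
import gym


def make_ood_cartpole(length_scale=2.0):
    """Create an OOD variant of CartPole-v1 by scaling the pole length.

    Assumes Gym's classic CartPoleEnv, which exposes its physics
    constants as plain attributes; scaling them shifts the transition
    dynamics away from the training distribution without changing the
    observation space.
    """
    env = gym.make("CartPole-v1")
    env.unwrapped.length *= length_scale
    env.unwrapped.polemass_length = (
        env.unwrapped.masspole * env.unwrapped.length
    )
    return env


class NoisyObservation(gym.ObservationWrapper):
    """Corrupt pixel observations with Gaussian noise (illustrative
    stand-in for an observation-level distribution shift)."""

    def __init__(self, env, sigma=0.1):
        super().__init__(env)
        self.sigma = sigma

    def observation(self, obs):
        noise = np.random.normal(0.0, self.sigma * 255.0, obs.shape)
        noisy = obs.astype(np.float32) + noise
        return np.clip(noisy, 0, 255).astype(obs.dtype)
```

A trained policy would be run in both the unmodified environment and these shifted variants, with an uncertainty method (e.g. an ensemble) scoring each state to decide whether it is in- or out-of-distribution.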
