Regression Prior Networks

Prior Networks are a recently developed class of models that yield interpretable measures of uncertainty and have been shown to outperform state-of-the-art ensemble approaches on a range of tasks. They can also be used to distill an ensemble of models via Ensemble Distribution Distillation (EnD$^2$), such that the ensemble's accuracy, calibration, and uncertainty estimates are retained within a single model. However, Prior Networks have so far been developed only for classification tasks. This work extends Prior Networks and EnD$^2$ to regression by considering the Normal-Wishart distribution, the conjugate prior over the mean and precision of a Normal likelihood. The properties of Regression Prior Networks are demonstrated on synthetic data, selected UCI datasets, and a monocular depth estimation task, where they yield performance competitive with ensemble approaches.
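To make the core idea concrete, the following is a minimal PyTorch sketch of a Regression Prior Network output head that maps features to Normal-Wishart parameters. The class name `NormalWishartHead`, the softplus parameterisation of the positive quantities, and the restriction to a diagonal Wishart scale matrix are illustrative assumptions for this sketch, not the paper's exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class NormalWishartHead(nn.Module):
    """Sketch of a Regression Prior Network head (hypothetical design).

    Maps features h to the parameters of a Normal-Wishart distribution
    NW(mu, Lambda | m, kappa, L, nu) over the mean mu and precision Lambda
    of a D-dimensional Normal likelihood. For simplicity the Wishart scale
    matrix is restricted to be diagonal.
    """

    def __init__(self, in_features: int, target_dim: int):
        super().__init__()
        self.target_dim = target_dim
        # m: location of the mean; unconstrained, one output per target dim.
        self.m = nn.Linear(in_features, target_dim)
        # kappa > 0: pseudo-count expressing confidence in the mean.
        self.log_kappa = nn.Linear(in_features, 1)
        # nu > D - 1: Wishart degrees of freedom (confidence in the precision).
        self.log_nu_offset = nn.Linear(in_features, 1)
        # Positive diagonal of the Wishart scale matrix.
        self.log_scale_diag = nn.Linear(in_features, target_dim)

    def forward(self, h: torch.Tensor):
        D = self.target_dim
        m = self.m(h)
        # softplus + small epsilon keeps the constrained parameters valid.
        kappa = F.softplus(self.log_kappa(h)) + 1e-6
        nu = F.softplus(self.log_nu_offset(h)) + (D - 1) + 1e-6
        scale_diag = F.softplus(self.log_scale_diag(h)) + 1e-6
        return m, kappa, nu, scale_diag


# Usage: predict Normal-Wishart parameters for a batch of feature vectors.
head = NormalWishartHead(in_features=128, target_dim=1)
m, kappa, nu, scale_diag = head(torch.randn(4, 128))
```

Because the Normal-Wishart is conjugate to the Normal, its posterior predictive is a multivariate Student's t, so a point prediction (m) and the decomposition of total uncertainty into data and knowledge uncertainty follow in closed form; this is a standard property of the distribution rather than a detail taken from the paper.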
