Measuring the sensitivity of Gaussian processes to kernel choice

Gaussian processes (GPs) are used to make medical and scientific decisions, including in cardiac care and in monitoring carbon dioxide emissions. Yet the choice of GP kernel is often somewhat arbitrary. In particular, uncountably many kernels typically align with qualitative prior knowledge (e.g., function smoothness or stationarity), whereas in practice data analysts choose among a handful of convenient standard kernels (e.g., squared exponential). In the present work, we ask: would decisions made with a GP differ under other, qualitatively interchangeable kernels? We show how to formulate this sensitivity analysis as a constrained optimization problem over a finite-dimensional space. We can then use standard optimizers to identify substantive changes in relevant decisions made with a GP. We demonstrate in both synthetic and real-world examples that decisions made with a GP can exhibit substantial sensitivity to kernel choice, even when prior draws are qualitatively interchangeable to a user.
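To make the idea of sensitivity analysis as constrained optimization concrete, the following is a minimal sketch, not the method of this work. It assumes a deliberately simplified setting: the alternative kernels are squared exponential kernels whose lengthscale lies within a relative radius of a baseline value, and the quantity of interest is the GP posterior mean at one test input. The function names, the toy data, the noise variance, and the radius are all hypothetical choices made for illustration.

```python
import numpy as np
from scipy.optimize import minimize_scalar


def sq_exp_kernel(a, b, lengthscale, variance=1.0):
    """Squared exponential kernel matrix between 1-D input arrays a and b."""
    diff = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (diff / lengthscale) ** 2)


def posterior_mean(x_train, y_train, x_test, lengthscale, noise_var=0.1):
    """GP regression posterior mean at x_test (zero prior mean, Gaussian noise)."""
    gram = sq_exp_kernel(x_train, x_train, lengthscale)
    gram = gram + noise_var * np.eye(len(x_train))
    cross = sq_exp_kernel(x_test, x_train, lengthscale)
    return cross @ np.linalg.solve(gram, y_train)


def sensitivity_range(x_train, y_train, x_test, base_lengthscale, rel_radius=0.2):
    """Smallest and largest posterior mean found over the constrained lengthscales.

    Assumption: "nearby" kernels are those with a lengthscale within
    rel_radius (relative) of the baseline. This is a toy constraint set.
    """
    lower = base_lengthscale * (1.0 - rel_radius)
    upper = base_lengthscale * (1.0 + rel_radius)

    def objective(lengthscale):
        return posterior_mean(x_train, y_train, x_test, lengthscale)[0]

    res_min = minimize_scalar(objective, bounds=(lower, upper), method="bounded")
    res_max = minimize_scalar(lambda ell: -objective(ell),
                              bounds=(lower, upper), method="bounded")

    # The bounded optimizer is local, so also compare against the baseline
    # and the two endpoints of the constraint interval.
    candidates = [objective(base_lengthscale), objective(lower), objective(upper)]
    low = min([res_min.fun] + candidates)
    high = max([-res_max.fun] + candidates)
    return low, high


# Toy data (hypothetical): noisy observations of a sine curve.
rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 5.0, 15)
y_train = np.sin(x_train) + 0.1 * rng.standard_normal(x_train.shape)
x_test = np.array([5.5])

base = posterior_mean(x_train, y_train, x_test, lengthscale=1.0)[0]
low, high = sensitivity_range(x_train, y_train, x_test, base_lengthscale=1.0)
print(f"baseline mean: {base:.3f}, range under nearby kernels: [{low:.3f}, {high:.3f}]")
```

If a decision depends on whether the prediction crosses a threshold, comparing that threshold against the interval `[low, high]` indicates whether the decision could flip within the constraint set. A one-parameter family is far narrower than the class of qualitatively interchangeable kernels considered in this work, so this sketch can only understate the sensitivity.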
