Conformal predictive distributions with kernels

This paper reviews the checkered history of predictive distributions in statistics and discusses two developments, one from recent literature and the other new. The first is bringing predictive distributions into machine learning, a field whose early development was deeply influenced by two remarkable groups at the Institute of Automation and Remote Control. The second is combining predictive distributions with kernel methods, which originated in one of those groups, whose members included Emmanuel Braverman.
