A Distribution-Free Test of Covariate Shift Using Conformal Prediction

Covariate shift is a common and important assumption in transfer learning and domain adaptation: it attributes the distributional difference between training and testing data to the covariates alone, leaving the conditional distribution of the response given the covariates unchanged. We propose a nonparametric test of covariate shift using the conformal prediction framework. Our test statistic combines recent developments in conformal prediction with a novel choice of conformity score, yielding a test that is valid and powerful under very general settings. To our knowledge, this is the first successful attempt at using conformal prediction to test statistical hypotheses. Our method is suited to modern machine learning scenarios with high-dimensional data and large sample sizes, and it can be combined effectively with existing classification algorithms to find good conformity score functions. We demonstrate the performance of the proposed method on synthetic and real data examples.
