Selection of Predictors in Distance-Based Regression

Distance-based regression is a two-step prediction method: distances between observations yield latent variables, which in turn serve as the regressors in an ordinary least squares linear model. The distances are computed from the observed predictors by means of a suitable dissimilarity function. Because the observed predictors are generally nonlinearly related to the response, they cannot be selected with the usual F tests. In this article, we propose a solution to this predictor selection problem by defining generalized test statistics and adapting a nonparametric bootstrap method to estimate their p-values. We include a numerical example with automobile insurance data.
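The two steps described above can be illustrated with a minimal sketch. It assumes the latent variables are obtained by classical metric scaling (principal coordinates) of the distance matrix, and it uses the Euclidean distance only as a stand-in for whatever dissimilarity function suits the data. The function names `principal_coordinates`, `db_regression_fit`, and `euclidean_distances` are hypothetical, chosen for this illustration; the generalized test statistics and the bootstrap procedure of the article are not reproduced here.

```python
import numpy as np


def euclidean_distances(Z):
    """Pairwise Euclidean distances between the rows of Z (illustrative choice)."""
    diff = Z[:, None, :] - Z[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))


def principal_coordinates(D, tol=1e-9):
    """Step 1: latent variables from an n x n distance matrix via classical scaling."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                # doubly centered inner products
    vals, vecs = np.linalg.eigh(B)             # eigenvalues in ascending order
    order = np.argsort(vals)[::-1]             # reorder to descending
    vals, vecs = vals[order], vecs[:, order]
    keep = vals > tol * vals[0]                # drop null and negative directions
    return vecs[:, keep] * np.sqrt(vals[keep])


def db_regression_fit(D, y, n_components):
    """Step 2: ordinary least squares of y on the leading latent variables."""
    X = principal_coordinates(D)[:, :n_components]
    X1 = np.column_stack([np.ones(len(y)), X])  # add an intercept column
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return X1 @ beta                            # fitted values


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    Z = rng.normal(size=(30, 3))                # synthetic predictors
    y = 1.0 + Z @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=30)
    fitted = db_regression_fit(euclidean_distances(Z), y, n_components=3)
    print(np.round(fitted[:5], 3))
```

With the Euclidean distance and all nonzero principal coordinates retained, the fitted values coincide with those of an ordinary linear regression on the original predictors, which makes a convenient sanity check. Other dissimilarity functions, for example ones suited to mixed continuous and categorical data, lead to fits that are nonlinear in the observed predictors.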
