A flexible approach to inference in semiparametric regression models with correlated errors using Gaussian processes

Consider a semiparametric regression model in which the mean function depends on a finite-dimensional regression parameter, which is the parameter of interest, and an unknown function, which is treated as a nuisance parameter. We propose a method of inference for such models based on an integrated likelihood, in which the unknown function is eliminated by averaging with respect to a given distribution. We take this distribution to be a Gaussian process whose covariance function is chosen to reflect the assumptions made about the function. The approach is easily implemented and applies to a wide range of models using the same basic methodology. Consistency and asymptotic normality of the estimator of the parameter of interest are established under mild conditions. The proposed method is illustrated on several examples.
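To make the idea of integrating out the unknown function concrete, the following is a minimal sketch rather than the paper's implementation. It assumes a partially linear model y = Xβ + γ(z) + ε with independent Gaussian errors, a zero-mean Gaussian process for γ with a squared-exponential covariance, and fixed hyperparameters; none of these choices is stated in the abstract. The names `sq_exp_kernel`, `integrated_loglik`, and `beta_hat` are hypothetical. Under these assumptions, averaging over γ gives y ~ N(Xβ, K + σ²I), so the integrated likelihood is Gaussian in β and is maximized by a generalized least squares estimate.

```python
import numpy as np

def sq_exp_kernel(z, length_scale=0.3, variance=1.0):
    """Squared-exponential covariance matrix for scalar inputs z (an assumed choice)."""
    d = z[:, None] - z[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def marginal_cov(z, noise_var, length_scale, variance):
    """Covariance of y after averaging over the GP: K + noise_var * I."""
    return sq_exp_kernel(z, length_scale, variance) + noise_var * np.eye(len(z))

def integrated_loglik(beta, y, X, z, noise_var=0.04, length_scale=0.3, variance=1.0):
    """Log integrated likelihood of beta: log N(y | X beta, K + noise_var * I)."""
    n = len(y)
    V = marginal_cov(z, noise_var, length_scale, variance)
    L = np.linalg.cholesky(V)
    resid = np.linalg.solve(L, y - X @ beta)   # whitened residual
    logdet = 2.0 * np.sum(np.log(np.diag(L)))
    return -0.5 * (resid @ resid + logdet + n * np.log(2.0 * np.pi))

def beta_hat(y, X, z, noise_var=0.04, length_scale=0.3, variance=1.0):
    """Maximizer of the integrated likelihood in beta (generalized least squares)."""
    V = marginal_cov(z, noise_var, length_scale, variance)
    Vinv_X = np.linalg.solve(V, X)
    return np.linalg.solve(X.T @ Vinv_X, Vinv_X.T @ y)

# Simulated data (illustrative only): two covariates and a smooth nuisance function.
rng = np.random.default_rng(0)
n = 200
z = rng.uniform(0.0, 1.0, size=n)
X = rng.normal(size=(n, 2))
beta_true = np.array([1.5, -0.7])
gamma = np.sin(2.0 * np.pi * z)               # unknown function, treated as a nuisance
y = X @ beta_true + gamma + 0.2 * rng.normal(size=n)

est = beta_hat(y, X, z)
print("estimate of beta:", est)
```

In practice the covariance hyperparameters and the error variance would also need to be estimated, and a correlated error structure would replace the `noise_var * I` term with a more general covariance matrix; the sketch holds them fixed to keep the role of the Gaussian process visible.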
