Bayesian multiple-response kernel regression model for high-dimensional data and its practical applications in near-infrared spectroscopy

Non-linear regression based on reproducing kernel Hilbert spaces (RKHS) has recently become very popular for fitting high-dimensional data. The RKHS formulation provides an automatic dimension reduction of the covariates, which is particularly helpful when the number of covariates (p) far exceeds the number of data points (n). In this paper, we introduce a Bayesian nonlinear multivariate regression model for high-dimensional problems. Our model is suitable when we have multiple correlated observed responses corresponding to the same set of covariates. We introduce a robust Bayesian support vector regression model based on a multivariate version of Vapnik's ε-insensitive loss function. The likelihood corresponding to the multivariate ε-insensitive loss function is constructed as a scale mixture of truncated normal and gamma distributions. The regression function is constructed using the finite representation of a function in the RKHS. The kernel parameter is estimated adaptively by assigning a prior to it and using Markov chain Monte Carlo (MCMC) techniques for computation. The practical utility of our model is demonstrated through applications to near-infrared (NIR) spectroscopy and through simulation studies. Our Bayesian kernel models are highly accurate in predicting the composition of materials from their NIR spectroscopic signatures. We compare our method with methodologies popularly used in NIR spectroscopy, namely partial least squares (PLS), principal component regression (PCR), support vector machine (SVM) regression, Gaussian process regression (GPR), and random forest (RF). In all the simulation and real case studies, our multivariate Bayesian RKHS regression model outperforms the standard methods by a substantial margin. The MCMC-based implementation of our models is fairly fast and straightforward.
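For concreteness, the key ingredients can be written in generic (univariate) notation; the following is a minimal sketch using standard definitions, not the paper's exact multivariate formulation. Vapnik's ε-insensitive loss discards residuals of magnitude at most ε, the corresponding pseudo-likelihood exponentiates the negative loss, and the regression function takes the finite RKHS representation, here shown with a Gaussian kernel (a common choice; the paper's kernel and multivariate loss are as specified there) whose bandwidth θ is assigned a prior and sampled within the MCMC:

\[
L_\varepsilon(r) = \max\{0,\ |r| - \varepsilon\},
\qquad
p(y \mid f(\mathbf{x})) \propto \exp\{-C\, L_\varepsilon(y - f(\mathbf{x}))\},
\]
\[
f(\mathbf{x}) = \beta_0 + \sum_{i=1}^{n} \alpha_i\, K_\theta(\mathbf{x}, \mathbf{x}_i),
\qquad
K_\theta(\mathbf{x}, \mathbf{x}') = \exp\!\big(-\lVert \mathbf{x} - \mathbf{x}' \rVert^2 / \theta\big).
\]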
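As a point of reference for the benchmark methods mentioned above (PLS, PCR, SVM, GPR, RF), the sketch below fits all five with scikit-learn on synthetic, spectra-like data in the p >> n regime. The data generator, component counts, and other hyperparameters are illustrative assumptions, not the settings used in the paper's experiments.

import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVR

# Synthetic, smooth "spectra": n data points, p >> n covariates (assumption).
rng = np.random.default_rng(0)
n, p = 100, 700
X = rng.normal(size=(n, p)).cumsum(axis=1)           # random-walk curves
y = np.sin(X[:, ::100]).sum(axis=1) + 0.1 * rng.normal(size=n)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

baselines = {
    "PLS": PLSRegression(n_components=10),
    "PCR": make_pipeline(PCA(n_components=10), LinearRegression()),
    "SVM": SVR(kernel="rbf", epsilon=0.1),            # epsilon-insensitive loss
    "GPR": GaussianProcessRegressor(kernel=RBF(), alpha=1e-2),
    "RF": RandomForestRegressor(n_estimators=500, random_state=0),
}
for name, model in baselines.items():
    model.fit(X_tr, y_tr)
    print(f"{name}: test R^2 = {model.score(X_te, y_te):.3f}")

The snippet only establishes the comparison setup used for the baselines; the paper's reported results come from the NIR calibration data sets and simulation designs described therein.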
