Applications of Hybrid Monte Carlo to Bayesian Generalized Linear Models: Quasicomplete Separation and Neural Networks

Abstract The “leapfrog” hybrid Monte Carlo algorithm is a simple and effective MCMC method for fitting Bayesian generalized linear models with canonical link. The algorithm produces long trajectories over the posterior and a rapidly mixing Markov chain, and it outperforms conventional methods in difficult problems such as logistic regression with quasicomplete separation. It offers an attractive solution to this common problem, providing a way to identify datasets that are quasicompletely separated and the covariates that are at the root of the problem. The method is also quite successful in fitting generalized linear models in which the link function is extended to include a feedforward neural network. With a large number of hidden units, however, or when the dataset becomes large, calculating the gradient along each trajectory can become computationally very demanding. In this case, it is best to mix the algorithm with multivariate random walk Met...
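As a rough illustration of the leapfrog scheme described in the abstract, the following minimal sketch applies one hybrid Monte Carlo update to Bayesian logistic regression (a GLM with canonical link). It is not the paper's implementation. The independent N(0, prior_sd^2) prior on the coefficients, the unit mass matrix, the step size, the trajectory length, and the function names log_post_and_grad and hmc_step are all assumptions made for the example.

```python
import numpy as np


def log_post_and_grad(beta, X, y, prior_sd=10.0):
    """Log posterior (up to a constant) and its gradient for logistic regression.

    Assumes an independent N(0, prior_sd^2) prior on each coefficient;
    this prior is an illustrative choice, not taken from the paper.
    """
    eta = X @ beta
    # Bernoulli log-likelihood with logit link: sum(y*eta - log(1 + exp(eta)))
    loglik = np.sum(y * eta - np.logaddexp(0.0, eta))
    logprior = -0.5 * np.sum(beta ** 2) / prior_sd ** 2
    # Numerically stable sigmoid: exp(-log(1 + exp(-eta)))
    mu = np.exp(-np.logaddexp(0.0, -eta))
    grad = X.T @ (y - mu) - beta / prior_sd ** 2
    return loglik + logprior, grad


def hmc_step(beta, X, y, rng, step=0.05, n_leapfrog=20):
    """One hybrid Monte Carlo update with a leapfrog trajectory.

    step and n_leapfrog are hypothetical tuning values.
    Returns the new state and whether the proposal was accepted.
    """
    p0 = rng.standard_normal(beta.shape)   # fresh Gaussian momentum
    logp0, grad = log_post_and_grad(beta, X, y)

    q, p = beta.copy(), p0.copy()
    p += 0.5 * step * grad                 # initial half step for momentum
    for i in range(n_leapfrog):
        q += step * p                      # full step for position
        logp, grad = log_post_and_grad(q, X, y)
        if i < n_leapfrog - 1:
            p += step * grad               # full step for momentum
    p += 0.5 * step * grad                 # final half step for momentum

    # Hamiltonian = potential energy (-log posterior) + kinetic energy
    h0 = -logp0 + 0.5 * (p0 @ p0)
    h1 = -logp + 0.5 * (p @ p)
    if np.log(rng.uniform()) < h0 - h1:    # Metropolis accept/reject
        return q, True
    return beta, False


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Synthetic data, generated only to exercise the sampler
    X = np.column_stack([np.ones(100), rng.standard_normal(100)])
    true_beta = np.array([-0.5, 1.0])
    y = (rng.uniform(size=100) < 1.0 / (1.0 + np.exp(-X @ true_beta))).astype(float)

    beta = np.zeros(2)
    draws = []
    for _ in range(500):
        beta, _ = hmc_step(beta, X, y, rng)
        draws.append(beta)
    print("posterior mean estimate:", np.mean(draws, axis=0))
```

Each trajectory needs one gradient evaluation per leapfrog step, which is where the cost noted in the abstract arises when the dataset or the model grows.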
