Bayesian deviance, the effective number of parameters, and the comparison of arbitrarily complex models

We consider the problem of comparing complex hierarchical models in which the number of parameters is not clearly defined. We follow Dempster in examining the posterior distribution of the log-likelihood under each model, from which we derive measures of fit and complexity (the effective number of parameters). These may be combined into a Deviance Information Criterion (DIC), which is shown to have an approximate decision-theoretic justification. Analytic and asymptotic identities reveal the measure of complexity to be a generalisation of a wide range of previous suggestions, with particular reference to the neural network literature. The contributions of individual observations to fit and complexity can give rise to a diagnostic plot of deviance residuals against leverages. The procedure is illustrated in a number of examples, and throughout it is emphasised that the required quantities are trivial to compute in a Markov chain Monte Carlo analysis and require no analytic work for new models.
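As an illustration of how little computation is involved, the following is a minimal sketch of obtaining the fit, complexity and DIC from posterior draws. The abstract itself gives no formulas, so the sketch assumes the commonly cited definitions: deviance D(θ) = −2 log p(y | θ), fit measured by the posterior mean deviance, complexity p_D equal to the mean deviance minus the deviance at the posterior mean, and DIC equal to the mean deviance plus p_D. The model (normal data with known variance and a flat prior on the mean), the function names `deviance` and `dic_from_draws`, and the use of direct posterior sampling in place of a Markov chain are all assumptions made for the example.

```python
import math
import random
from statistics import fmean


def deviance(y, mu, sigma=1.0):
    """Deviance D(mu) = -2 * log-likelihood for i.i.d. Normal(mu, sigma^2) data."""
    n = len(y)
    rss = sum((yi - mu) ** 2 for yi in y)
    return n * math.log(2.0 * math.pi * sigma ** 2) + rss / sigma ** 2


def dic_from_draws(y, mu_draws, sigma=1.0):
    """Return (mean deviance, p_D, DIC) from posterior draws of mu.

    Assumed definitions (not stated in the abstract):
      mean deviance = average of D(mu) over the draws    (fit)
      p_D           = mean deviance - D(posterior mean)  (complexity)
      DIC           = mean deviance + p_D
    """
    d_bar = fmean([deviance(y, mu, sigma) for mu in mu_draws])
    d_at_mean = deviance(y, fmean(mu_draws), sigma)
    p_d = d_bar - d_at_mean
    return d_bar, p_d, d_bar + p_d


# Hypothetical data: 50 observations from Normal(2, 1).
random.seed(0)
y = [random.gauss(2.0, 1.0) for _ in range(50)]

# With a flat prior on mu and known sigma = 1, the posterior is
# Normal(mean(y), 1/n), so we draw from it directly instead of running MCMC.
y_bar = fmean(y)
post_sd = 1.0 / math.sqrt(len(y))
mu_draws = [random.gauss(y_bar, post_sd) for _ in range(20000)]

d_bar, p_d, dic = dic_from_draws(y, mu_draws)
print(f"mean deviance = {d_bar:.2f}, p_D = {p_d:.2f}, DIC = {dic:.2f}")
```

In this toy model there is one free parameter and no informative prior, so p_D should come out close to 1. In a hierarchical model the same two averages would be taken over the sampler's output, with p_D typically falling below the nominal parameter count as the prior shrinks the estimates.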
