Bayesian density estimation from grouped continuous data

Grouped data occur frequently in practice, either because instruments have limited resolution or because data have been summarized in relatively wide bins. A combination of the composite link model with roughness penalties is proposed to estimate smooth densities from such data in a Bayesian framework. A simulation study evaluates the performance of this strategy in estimating a density, its quantiles, and its first moments. Two illustrations are presented: the first involves grouped data on blood lead concentrations, and the second the number of deaths due to tuberculosis in The Netherlands, recorded in wide age classes.
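
To make the idea of a composite link model with a roughness penalty concrete, here is a minimal sketch of a penalized-likelihood (frequentist) analogue, fitted by iteratively reweighted least squares. It is not the Bayesian procedure summarized above. The function name `pclm_density`, the identity basis on a fine grid (in place of a B-spline basis), the fixed penalty weight `lam`, and the toy counts are all assumptions made for illustration.

```python
import numpy as np

def pclm_density(counts, edges, n_fine=50, lam=10.0, diff_order=2,
                 n_iter=200, tol=1e-8):
    """Illustrative penalized composite link fit for binned counts.

    Assumes the wide-bin edges coincide with points of the fine grid.
    """
    counts = np.asarray(counts, dtype=float)
    edges = np.asarray(edges, dtype=float)
    grid = np.linspace(edges[0], edges[-1], n_fine + 1)
    mids = 0.5 * (grid[:-1] + grid[1:])
    width = grid[1] - grid[0]

    # Composition matrix: C[i, j] = 1 when fine cell j lies in wide bin i.
    idx = np.searchsorted(edges, mids, side="right") - 1
    C = np.zeros((len(counts), n_fine))
    C[idx, np.arange(n_fine)] = 1.0

    # Roughness penalty on differences of the log fine-grid means.
    D = np.diff(np.eye(n_fine), n=diff_order, axis=0)
    P = lam * D.T @ D

    # Start from a flat latent distribution with the right total.
    theta = np.full(n_fine, np.log(counts.sum() / n_fine))
    for _ in range(n_iter):
        gamma = np.exp(theta)          # latent fine-grid means
        mu = C @ gamma                 # expected wide-bin counts
        Q = C * gamma                  # C @ diag(gamma)
        A = Q.T @ (Q / mu[:, None]) + P
        b = Q.T @ ((counts - mu + Q @ theta) / mu)
        theta_new = np.linalg.solve(A, b)
        converged = np.max(np.abs(theta_new - theta)) < tol
        theta = theta_new
        if converged:
            break

    gamma = np.exp(theta)
    density = gamma / (gamma.sum() * width)  # normalized to integrate to 1
    return mids, density, C @ gamma

# Toy example (made-up counts in five bins of width 10).
counts = [5, 30, 45, 15, 5]
edges = [0, 10, 20, 30, 40, 50]
mids, density, fitted = pclm_density(counts, edges)
```

In this sketch the smoothing weight `lam` is held fixed. In a Bayesian treatment the penalty would instead correspond to a prior on the coefficients, with the penalty parameter given its own prior and the posterior explored by sampling.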
