Semiparametric Mixture Models for Multivariate Count Data, with Application

The analysis of overdispersed counts has generated a wide literature, whose general objective is to provide reliable parameter estimates in the presence of heterogeneity or dependence among subjects. In this paper we extend the standard variance component models to the analysis of multivariate counts, defining the dependence among counts through a set of correlated random coefficients. Estimation is carried out by numerical integration through an EM algorithm, without parametric assumptions on the distribution of the random coefficients. The proposed model is computationally parsimonious and, when applied to a real dataset, seems to produce better results than parametric models. A simulation study investigates the behaviour of the proposed models in a series of empirical situations. Copyright Royal Economic Society 2004
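The estimation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes Poisson outcomes with no covariates, a fixed number K of mass points, and a discrete mixing distribution whose locations are vectors (one Poisson mean per outcome), so that dependence among counts comes from the shared mass point. The function name `npml_em`, the starting rule, and the numerical guards are all hypothetical choices made for illustration.

```python
import numpy as np

def npml_em(y, K=3, n_iter=200, seed=0):
    """EM for a K-point discrete mixing distribution shared by J Poisson outcomes.

    y : (n, J) array of counts for n units and J outcomes.
    Returns (pi, lam, w): masses (K,), locations (K, J), posterior weights (n, K).
    Illustrative sketch only: no covariates, K fixed in advance.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    n, J = y.shape

    # Hypothetical starting rule: randomly perturbed column means.
    lam = y.mean(axis=0) * rng.uniform(0.5, 1.5, size=(K, J)) + 1e-3
    pi = np.full(K, 1.0 / K)

    for _ in range(n_iter):
        # E-step: posterior probability that unit i belongs to mass point k.
        # The log(y!) term does not depend on k, so it cancels and is omitted.
        log_f = y @ np.log(lam).T - lam.sum(axis=1)      # shape (n, K)
        log_w = log_f + np.log(pi)
        log_w -= log_w.max(axis=1, keepdims=True)        # numerical stability
        w = np.exp(log_w)
        w /= w.sum(axis=1, keepdims=True)

        # M-step: update masses and locations (weighted means of the counts).
        pi = np.clip(w.mean(axis=0), 1e-12, None)
        pi /= pi.sum()
        lam = (w.T @ y) / np.clip(w.sum(axis=0), 1e-12, None)[:, None]
        lam = np.clip(lam, 1e-8, None)                   # keep log(lam) finite

    return pi, lam, w
```

Calling `npml_em(y, K=2)` on an `(n, 2)` array of bivariate counts returns the estimated masses and locations. In the paper's setting the locations would enter a regression with covariates, and the number of mass points would be chosen by comparing fits across K rather than fixed in advance.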
