Finite mixtures of quantile and M-quantile regression models

In this paper we define finite mixtures of quantile and M-quantile regression models for heterogeneous and/or dependent (clustered) data. Components of the finite mixture represent clusters of individuals with homogeneous values of the model parameters. Owing to their flexibility and ease of estimation, the proposed approaches can be extended from the simple random intercept case to random coefficients of higher dimension. Model parameters are estimated by maximum likelihood through an EM-type algorithm. Standard errors for the model parameters are obtained from the inverse of the observed information matrix, derived through the Oakes (J R Stat Soc Ser B 61:479–482, 1999) formula in the M-quantile setting, and through nonparametric bootstrap in the quantile setting. We present a large-scale simulation study to analyse the practical behaviour of the proposed models and to evaluate the empirical performance of the proposed standard error estimates. We consider a variety of empirical settings in both the random intercept and the random coefficient cases. The proposed modelling approaches are also applied to two well-known datasets, which give further insight into their empirical behaviour.
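
As a rough illustration of the model class described above, the random intercept case at quantile level q might be written as follows. This is only a sketch under stated assumptions, not the paper's own formulation: the notation (y_{ij} for observation j in cluster i, covariates x_{ij}, K mixture components with locations b_k and masses \pi_k, scale \sigma) is introduced here for illustration, and the asymmetric Laplace density is assumed as the working likelihood, a common choice in the quantile regression literature.

```latex
% Illustrative notation only; symbols are assumptions, not taken from the paper.
\begin{align*}
\rho_q(u) &= u\,\{q - \mathbb{1}(u < 0)\}, \\
f_q(y_{ij} \mid b_k) &= \frac{q(1-q)}{\sigma}
  \exp\left\{-\rho_q\!\left(\frac{y_{ij} - x_{ij}^{\top}\beta - b_k}{\sigma}\right)\right\}, \\
L(\theta) &= \prod_{i=1}^{n} \sum_{k=1}^{K} \pi_k \prod_{j=1}^{n_i} f_q(y_{ij} \mid b_k), \\
w_{ik} &= \frac{\pi_k \prod_{j=1}^{n_i} f_q(y_{ij} \mid b_k)}
               {\sum_{l=1}^{K} \pi_l \prod_{j=1}^{n_i} f_q(y_{ij} \mid b_l)},
\qquad
\hat{\pi}_k = \frac{1}{n} \sum_{i=1}^{n} w_{ik}.
\end{align*}
```

Under these assumptions, an EM-type iteration would alternate between computing the posterior component weights w_{ik} (E-step) and updating \pi_k, together with a weighted quantile regression fit for \beta and b_k (M-step). For the M-quantile case, the check loss \rho_q would be replaced by an asymmetric version of a robust loss such as Huber's.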
