Two-stage hierarchical modeling for analysis of subpopulations in conditional distributions

In this work, we develop a modeling and estimation approach for the analysis of cross-sectional clustered data with multimodal conditional distributions, where the main interest is in the analysis of subpopulations. We propose to model such data with a hierarchical model in which the conditional distributions are viewed as finite mixtures of normal components. Given a large number of observations in the lowest-level clusters, a two-stage estimation approach is used. In the first stage, the normal mixture parameters in each lowest-level cluster are estimated using robust alternatives to maximum-likelihood (ML) estimation, which provide stable results even when the components of the conditional distributions do not quite meet the normality assumption. In the second stage, the lowest-level cluster-specific means and standard deviations are modeled in a mixed effects model. A small simulation study was conducted to compare the performance of finite normal mixture population parameter estimates based on robust and ML estimation in stage 1. The proposed modeling approach is illustrated through the analysis of mouse tendon fibril diameter data. The results of the analysis address genotype differences between corresponding components in the mixtures and demonstrate the advantages of robust estimation in stage 1.
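The two-stage idea can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several simplifications are assumed. Stage 1 fits a two-component normal mixture in each cluster by ordinary ML via EM rather than by the robust estimators the abstract refers to. Stage 2 simply averages the cluster-specific component means within each group, a crude stand-in for a mixed effects model. The group labels "A" and "B", the simulated values, and all function names are hypothetical.

```python
import math
import random
from statistics import mean


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def fit_two_component_mixture(data, n_iter=100):
    """Stage 1 (assumption: plain ML via EM, NOT a robust estimator).

    Returns [(weight, mean, sd), (weight, mean, sd)] ordered by mean.
    """
    xs = sorted(data)
    half = len(xs) // 2
    # Initialise the two means from the lower and upper halves of the data.
    mu = [mean(xs[:half]), mean(xs[half:])]
    sd0 = (xs[-1] - xs[0]) / 4.0 or 1.0
    sigma = [sd0, sd0]
    pi = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in xs:
            w = [pi[k] * normal_pdf(x, mu[k], sigma[k]) for k in range(2)]
            total = sum(w) or 1e-300
            resp.append([wk / total for wk in w])
        # M-step: weighted updates of weight, mean, and standard deviation.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            pi[k] = nk / len(xs)
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sigma[k] = max(math.sqrt(var), 1e-6)
    order = sorted(range(2), key=lambda k: mu[k])
    return [(pi[k], mu[k], sigma[k]) for k in order]


def stage_two_group_means(fits_by_group):
    """Stage 2 stand-in (assumption: simple averaging, NOT a mixed model).

    Averages the cluster-specific component means within each group.
    """
    return {
        group: [mean(fit[k][1] for fit in fits) for k in range(2)]
        for group, fits in fits_by_group.items()
    }


def simulate_cluster(rng, component_means, sigma, n):
    """Draw n values from an equal-weight normal mixture plus a cluster shift."""
    shift = rng.gauss(0.0, 2.0)  # cluster-level random effect
    return [rng.gauss(rng.choice(component_means) + shift, sigma) for _ in range(n)]


rng = random.Random(0)
# Hypothetical truth: the groups differ only in the second component mean.
truth = {"A": [50.0, 100.0], "B": [50.0, 120.0]}
fits_by_group = {
    group: [
        fit_two_component_mixture(simulate_cluster(rng, mus, 8.0, 300))
        for _ in range(4)
    ]
    for group, mus in truth.items()
}
summary = stage_two_group_means(fits_by_group)
for group, means in summary.items():
    print(group, [round(m, 1) for m in means])
```

In a fuller analysis, the stage 1 estimates (means and standard deviations of each component) would be passed to a mixed effects model with random effects for the higher-level clusters, and the ML fit would be replaced by a robust alternative.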
