Hypothesis testing for finite mixture models

Hypothesis testing for finite mixture models has long been a challenging problem. The standard likelihood ratio test (LRT) does not have the usual asymptotic χ² distribution, partly because the mixture model is not identifiable under the null hypothesis. A simple class of hypothesis test procedures for finite mixture models, based on goodness-of-fit (GOF) test statistics, is investigated. The suggested procedure is easy to understand and use, and it can be applied to many mixture models with continuous data. Five commonly used GOF test statistics are considered and compared. The limiting distribution of each test statistic is approximated by simulation using the bootstrap method. Extensive simulation studies demonstrate that a simple application of GOF test statistics to finite mixture models can provide hypothesis test performance comparable or even superior to that of the existing state-of-the-art EM test. The effectiveness of the GOF test for choosing the number of components is also demonstrated through limited empirical studies and a real data application.
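To illustrate the general idea of a bootstrap-calibrated GOF test, the following is a minimal sketch and not the procedure from the paper. It assumes the simplest null hypothesis (a single normal component), uses only the Kolmogorov-Smirnov statistic rather than the five statistics compared in the paper, and refits the null model on every parametric bootstrap sample so that the effect of parameter estimation is reflected in the reference distribution. All function names, the sample size, and the number of bootstrap replicates are illustrative assumptions.

```python
import math
import random


def normal_cdf(x, mu, sigma):
    """CDF of N(mu, sigma^2) evaluated at x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def fit_normal(data):
    """Maximum likelihood estimates (mean, standard deviation) under the null."""
    n = len(data)
    mu = sum(data) / n
    var = sum((x - mu) ** 2 for x in data) / n
    return mu, math.sqrt(var)


def ks_statistic(data, mu, sigma):
    """Kolmogorov-Smirnov distance between the empirical CDF and N(mu, sigma^2)."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = normal_cdf(x, mu, sigma)
        # Compare the fitted CDF with the empirical CDF just after and just before x.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def bootstrap_gof_pvalue(data, n_boot=200, seed=0):
    """Parametric bootstrap p-value for H0: the data come from one normal component."""
    rng = random.Random(seed)
    mu, sigma = fit_normal(data)
    observed = ks_statistic(data, mu, sigma)
    exceed = 0
    for _ in range(n_boot):
        # Simulate from the fitted null model, then refit before computing the statistic.
        sample = [rng.gauss(mu, sigma) for _ in range(len(data))]
        b_mu, b_sigma = fit_normal(sample)
        if ks_statistic(sample, b_mu, b_sigma) >= observed:
            exceed += 1
    # The +1 correction keeps the estimated p-value strictly positive.
    return observed, (exceed + 1) / (n_boot + 1)


def simulate_two_component_sample(n, seed=1):
    """Hypothetical test data: equal mixture of N(-3, 1) and N(3, 1)."""
    rng = random.Random(seed)
    return [rng.gauss(-3.0 if rng.random() < 0.5 else 3.0, 1.0) for _ in range(n)]


if __name__ == "__main__":
    data = simulate_two_component_sample(200)
    stat, p_value = bootstrap_gof_pvalue(data)
    print(f"KS statistic = {stat:.3f}, bootstrap p-value = {p_value:.3f}")
```

Testing a k-component null against a larger alternative would follow the same pattern, with `fit_normal` and `normal_cdf` replaced by a fitted k-component mixture (for example via the EM algorithm) and its CDF. That extension is not shown here.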
