Testing Homogeneity in a Mixture Distribution via the L2 Distance Between Competing Models

Ascertaining the number of components in a mixture distribution is an interesting and challenging problem for statisticians. Chen, Chen, and Kalbfleisch recently proposed a modified likelihood ratio test (MLRT), which is asymptotically distribution-free and locally most powerful. In this article we present a new method for testing whether a finite mixture distribution is homogeneous. Our method, the D test, is based on the L2 distance between a fitted homogeneous model and a fitted heterogeneous model. For mixture components from standard parametric families, the D-test statistic has a closed-form expression in terms of the parameter estimators, whereas likelihood ratio-type test statistics do not; the latter are nontrivial functions of both the parameter estimators and the full dataset. We establish the convergence rates of the D-test statistic under a null hypothesis of homogeneity and under an alternative hypothesis of heterogeneity. The D test is shown to be competitive with the MLRT when the mixture components come from a normal location family. In the exponential scale and normal location/scale cases, however, the relative performances of the D test and the MLRT are mixed. For such cases we propose a weighted D test, in which the measure underlying the L2 distance is changed to accentuate the disparities between the homogeneous and heterogeneous models. Changing the measure is equivalent to computing the D-test statistic with a weighting function, or to transforming the data before conducting the D test. Appropriately weighted D tests are competitive in both the exponential scale and normal location/scale cases. After applying the D test to a dataset of measurements of firms' financial performances, we conclude with discussion and remarks.
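To illustrate why an L2 distance between fitted models can be written in closed form, the following is a minimal sketch for the normal location case with a common, known variance. It relies only on the standard identity that the integral of the product of two normal densities N(x; a, s) and N(x; b, t), with variances s and t, equals the normal density with mean 0 and variance s + t evaluated at a - b. The function names, the parameter values, and the choice of the unscaled squared distance are assumptions made for illustration; the article's actual D-test statistic, its estimators, and any normalization are not reproduced here.

```python
import math


def normal_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def gauss_overlap(m1, v1, m2, v2):
    """Integral over x of N(x; m1, v1) * N(x; m2, v2), via the closed-form identity."""
    return normal_pdf(m1 - m2, 0.0, v1 + v2)


def l2_distance_sq(mu0, pi, mu1, mu2, var):
    """Squared L2 distance between the homogeneous density N(mu0, var) and the
    two-component mixture pi*N(mu1, var) + (1 - pi)*N(mu2, var).

    Hypothetical helper: it depends only on the parameter values, not on the data.
    """
    weights = (pi, 1.0 - pi)
    means = (mu1, mu2)
    # Integral of the squared homogeneous density.
    homo = gauss_overlap(mu0, var, mu0, var)
    # Integral of homogeneous density times mixture density.
    cross = sum(w * gauss_overlap(mu0, var, m, var) for w, m in zip(weights, means))
    # Integral of the squared mixture density.
    hetero = sum(
        wi * wj * gauss_overlap(mi, var, mj, var)
        for wi, mi in zip(weights, means)
        for wj, mj in zip(weights, means)
    )
    return homo - 2.0 * cross + hetero


def l2_distance_sq_numeric(mu0, pi, mu1, mu2, var, lo=-20.0, hi=20.0, n=40001):
    """Trapezoid-rule check of the same integral on [lo, hi]."""
    h = (hi - lo) / (n - 1)
    total = 0.0
    for k in range(n):
        x = lo + k * h
        f0 = normal_pdf(x, mu0, var)
        f1 = pi * normal_pdf(x, mu1, var) + (1.0 - pi) * normal_pdf(x, mu2, var)
        end_weight = 0.5 if k in (0, n - 1) else 1.0
        total += end_weight * (f0 - f1) ** 2
    return total * h


# Illustrative (made-up) parameter values standing in for fitted estimates.
closed = l2_distance_sq(0.0, 0.5, -1.0, 1.0, 1.0)
numeric = l2_distance_sq_numeric(0.0, 0.5, -1.0, 1.0, 1.0)
print(closed, numeric)
```

In this sketch the closed-form value needs only the five parameter values, whereas a likelihood ratio would require a pass over every observation. The distance collapses to zero when both component means equal the homogeneous mean, which is the configuration the null hypothesis describes.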

[1]  J. Hartigan. A failure of likelihood asymptotics for normal mixtures, 1985.

[2]  Jiayang Sun. Tail probabilities of the maxima of Gaussian random fields, 1993.

[3]  Lancelot F. James, et al. Bayesian Model Selection in Finite Mixtures by Marginal Density Decompositions, 2001.

[4]  P. Sen, et al. On the asymptotic performance of the log likelihood ratio statistic for the mixture model and related results, 1984.

[5]  D. Rubin, et al. Testing the number of components in a normal mixture, 2001.

[6]  B. Lindsay. Mixture models: theory, geometry, and applications, 1995.

[7]  Eric S. Lander, et al. Asymptotic distribution of the likelihood ratio test that a mixture of two binomials is a single binomial, 1995.

[8]  E. Lehmann. Elements of large-sample theory, 1998.

[9]  Jiahua Chen, et al. The likelihood ratio test for homogeneity in finite mixture models, 2001.

[10]  C. R. Rao, et al. The Utilization of Multiple Measurements in Problems of Biological Classification, 1948.

[11]  A. Cohen, et al. Finite Mixture Distributions, 1982.

[12]  Edward I. George, et al. Bayesian Model Selection, 2006.

[13]  G. McLachlan. On Bootstrapping the Likelihood Ratio Test Statistic for the Number of Components in a Normal Mixture, 1987.

[14]  Mohamed Lemdani, et al. Likelihood ratio tests in contamination models, 1999.

[15]  J. Ott. Analysis of Human Genetic Linkage, 1985.

[16]  J. Kalbfleisch, et al. A modified likelihood ratio test for homogeneity in finite mixture models, 2001.

[17]  L. Brown. Fundamentals of statistical exponential families: with applications in statistical decision theory, 1986.

[18]  N. Kiefer. Discrete Parameter Variation: Efficient Estimation of a Switching Regression Model, 1978.

[19]  Bernard Garel, et al. Likelihood ratio test for univariate Gaussian mixture, 2001.

[20]  Geoffrey J. McLachlan, et al. Finite Mixture Models, 2019, Annual Review of Statistics and Its Application.

[21]  R. Davies. Hypothesis testing when a nuisance parameter is present only under the alternative, 1977.

[22]  A. F. Smith, et al. Statistical analysis of finite mixture distributions, 1986.

[23]  P. Bickel. Asymptotic distribution of the likelihood ratio statistic in a prototypical non regular problem, 1993.

[24]  O. Pons, et al. Likelihood ratio tests for genetic linkage, 1997.

[25]  B. Lindsay. Moment Matrices: Applications in Mixtures, 1989.