Bias in misspecified mixtures.

A finite mixture is a distribution where a given observation can come from any of a finite set of components. That is, the density of the random variable X is of the form f(x) = pi_1 f_1(x) + pi_2 f_2(x) + ... + pi_k f_k(x), where the pi_i are the mixing proportions and the f_i are the component densities. Mixture models are common in many areas of biology; the most commonly applied is a mixture of normal densities. Many of the problems with inference in the mixture setting are well known. Not so well documented, however, are the extreme biases that can occur in the maximum likelihood estimators (MLEs) when there is model misspecification. This paper shows that even the seemingly innocuous assumption of equal variances for the components of the mixture can lead to surprisingly large asymptotic biases in the MLEs of the parameters. Assuming normality when the underlying distributions are skewed can also lead to strong biases. We explicitly calculate the asymptotic biases when maximum likelihood is carried out assuming normality for several types of true underlying distribution. If the true distribution is a mixture of skewed components, then an application of the Box-Cox power transformation can reduce the asymptotic bias substantially. The power lambda in the Box-Cox transformation is in this case treated as an additional parameter to be estimated. In many cases the bias can be reduced to acceptable levels, thus leading to meaningful inference. A modest Monte Carlo study gives an indication of the small-sample performance of inference procedures (including the power and level of likelihood ratio tests) based on a likelihood that incorporates estimation of lambda. A real data example illustrates the method.
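The working model described above can be sketched in a few lines of code. The following is a minimal illustration (not the paper's actual procedure): a Box-Cox transformation at a fixed power lambda, followed by EM for a two-component normal mixture under the equal-variance assumption that the abstract identifies as a potential source of bias. The function names and the simple initialization scheme are assumptions for the sketch; the paper itself profiles lambda jointly with the mixture parameters rather than fixing it.

```python
import math
import random

def box_cox(x, lam):
    """Box-Cox power transformation with power parameter lam (Box and Cox, 1964)."""
    return (x ** lam - 1.0) / lam if lam != 0 else math.log(x)

def em_two_normals(data, n_iter=300):
    """EM for a two-component normal mixture with a common variance.

    This is the (possibly misspecified) working model: if the true
    components are skewed or have unequal variances, the MLEs returned
    here can carry the large asymptotic biases discussed in the paper.
    Returns (pi1, mu1, mu2, common variance).
    """
    n = len(data)
    mean = sum(data) / n
    mu1, mu2 = min(data), max(data)      # crude starting values
    var = sum((x - mean) ** 2 for x in data) / n
    pi1 = 0.5
    for _ in range(n_iter):
        # E-step: posterior probability each point belongs to component 1.
        # The normal density's 1/sqrt(2*pi*var) factor cancels in the
        # ratio because the variance is common to both components.
        resp = []
        for x in data:
            d1 = pi1 * math.exp(-(x - mu1) ** 2 / (2.0 * var))
            d2 = (1.0 - pi1) * math.exp(-(x - mu2) ** 2 / (2.0 * var))
            resp.append(d1 / (d1 + d2))
        # M-step: update mixing proportion, means, and the pooled variance.
        s1 = sum(resp)
        pi1 = s1 / n
        mu1 = sum(r * x for r, x in zip(resp, data)) / s1
        mu2 = sum((1.0 - r) * x for r, x in zip(resp, data)) / (n - s1)
        var = sum(r * (x - mu1) ** 2 + (1.0 - r) * (x - mu2) ** 2
                  for r, x in zip(resp, data)) / n
    return pi1, mu1, mu2, var
```

For data drawn from a mixture of two lognormal components, fitting this model to the raw observations illustrates the skewness-induced bias, while fitting it to `box_cox(x, 0.0)` (the log transform) brings the working model back in line with the truth.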
