L2E estimation of mixture complexity for count data

For count data, robust estimation of the number of mixture components in finite mixtures is revisited using the L2 distance. An information criterion based on the L2 distance is shown to yield an estimator of the mixture complexity, and this estimator is shown to be strongly consistent. Monte Carlo simulations show that our estimator is competitive with other procedures in correctly determining the number of components when the data come from Poisson mixtures. When the data come from a negative binomial mixture but the postulated model is a Poisson mixture, simulations show that our estimator is highly competitive with the minimum Hellinger distance (MHD) estimator in terms of robustness against model misspecification. Furthermore, we illustrate the performance of our estimator on a real dataset with overdispersion and zero-inflation. Computational simplicity combined with robustness makes the L2E approach an attractive alternative to other procedures in the literature.
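To make the quantity behind the L2E approach concrete, the following is a minimal sketch, not the authors' implementation, of the squared L2 distance between the empirical probability mass function of a count sample and a postulated finite Poisson mixture. The function names, the `tail` truncation of the infinite sum, and the toy data are assumptions introduced for illustration only; the abstract does not specify the penalty term of the information criterion, so none is shown here.

```python
import math
from collections import Counter


def poisson_pmf(x, lam):
    """Poisson probability mass at count x for rate lam > 0."""
    return math.exp(-lam + x * math.log(lam) - math.lgamma(x + 1))


def mixture_pmf(x, weights, rates):
    """Mass at x of a finite Poisson mixture with the given weights and rates."""
    return sum(w * poisson_pmf(x, lam) for w, lam in zip(weights, rates))


def l2_distance(data, weights, rates, tail=50):
    """Squared L2 distance between the mixture pmf and the empirical pmf.

    The sum over all nonnegative integers is truncated at max(data) + tail
    (an assumption; the neglected mass is negligible for moderate rates).
    """
    n = len(data)
    counts = Counter(data)
    upper = max(data) + tail
    return sum(
        (mixture_pmf(x, weights, rates) - counts.get(x, 0) / n) ** 2
        for x in range(upper + 1)
    )


# Toy sample (hypothetical) with two visible clusters of counts.
data = [0, 1, 1, 2, 3, 8, 9, 10]
mean = sum(data) / len(data)

d_one = l2_distance(data, [1.0], [mean])                 # one component
d_two = l2_distance(data, [0.625, 0.375], [1.4, 9.0])    # two components
```

On this toy sample the two-component candidate lies closer to the empirical distribution than the single Poisson fitted at the sample mean. In practice the weights and rates would be chosen by numerically minimizing `l2_distance` for each candidate number of components before comparing the fits.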
