Consistent Order Estimation and Minimal Penalties

Consider an i.i.d. sequence of random variables whose distribution f* lies in one of a nested family of models M_q, q ≥ 1. The smallest index q* such that M_q* contains f* is called the model order. The aim of this paper is to explore the consistency properties of penalized likelihood model order estimators such as the Bayesian information criterion (BIC). We show in a general setting that the minimal strongly consistent penalty is of order η(q) log log n, where η(q) is a dimensional quantity. In contrast to previous work, an a priori upper bound on the model order is not assumed. The results rely on a sharp characterization of the pathwise fluctuations of the generalized likelihood ratio statistic under entropy assumptions on the model classes. Our results are applied to the geometrically complex problem of location mixture order estimation; such mixtures are widely used, but estimation of their order is poorly understood.
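
For orientation, a penalized likelihood order estimator of the kind discussed above can be written schematically as follows. This is only a sketch in assumed notation: the symbols X_1, ..., X_n, \hat q_n, \mathrm{pen}(n,q), \dim(M_q) and the constant C are illustrative and do not come from the abstract, which states only the order η(q) log log n of the minimal penalty.

```latex
% Schematic penalized likelihood order estimator (assumed notation).
% X_1, ..., X_n : the i.i.d. observations with distribution f^*
% pen(n, q)     : penalty attached to the model M_q at sample size n
\[
  \hat q_n \;=\; \min\, \operatorname*{arg\,max}_{q \ge 1}
  \Bigl\{ \sup_{f \in M_q} \sum_{i=1}^{n} \log f(X_i) \;-\; \mathrm{pen}(n,q) \Bigr\}
\]
% Strong consistency means \hat q_n = q^* for all sufficiently large n, almost surely.
% Two penalty shapes for comparison:
\[
  \mathrm{pen}_{\mathrm{BIC}}(n,q) \;=\; \tfrac{1}{2}\,\dim(M_q)\,\log n,
  \qquad
  \mathrm{pen}_{\min}(n,q) \;=\; C\,\eta(q)\,\log\log n .
\]
% C > 0 is a hypothetical constant; the abstract specifies only the order of the minimal penalty.
```

Note that the maximum runs over all q ≥ 1, reflecting the absence of an a priori upper bound on the model order, and that the log log n penalty grows much more slowly than the log n penalty used by BIC.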
