Minimax Adaptive Estimation of Nonparametric Hidden Markov Models

We consider stationary hidden Markov models with finite state space and nonparametric modeling of the emission distributions. Only very recently have such models been shown to be identifiable. In this paper, we propose a new penalized least-squares estimator for the emission distributions that is statistically optimal and practically tractable. We prove a nonasymptotic oracle inequality for this estimator. As a consequence, the estimator is rate minimax adaptive up to a logarithmic term. Our methodology is based on projections of the emission distributions onto nested subspaces of increasing complexity. The popular spectral estimators are unable to achieve the optimal rate but may be used as initial points in our procedure. Simulations show the improvement obtained when the least-squares minimization is applied after the spectral estimation.
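The idea of selecting a projection subspace by a penalized least-squares criterion can be illustrated on a much simpler problem. The sketch below is not the estimator of the paper: it assumes i.i.d. observations on [0, 1] rather than a hidden Markov chain, uses dyadic histogram bases as the nested subspaces, and takes a penalty of the form c * m / n with an arbitrary constant c. The function names and default values are hypothetical and chosen only for illustration.

```python
import numpy as np

def histogram_coefficients(x, m):
    """Empirical coefficients of the sample x on the orthonormal
    histogram basis phi_j = sqrt(m) * 1{bin j}, with m bins on [0, 1]."""
    # Bin index of each observation; x == 1.0 is placed in the last bin.
    bins = np.minimum((x * m).astype(int), m - 1)
    counts = np.bincount(bins, minlength=m)
    # a_j = (1/n) * sum_i phi_j(x_i) = sqrt(m) * count_j / n
    return np.sqrt(m) * counts / x.size

def select_dimension(x, max_level=6, c=2.0):
    """Pick the dyadic dimension m = 2**level that minimizes the
    penalized least-squares criterion -sum_j a_j**2 + c * m / n.
    The penalty shape and the constant c are assumptions of this sketch."""
    n = x.size
    best = None
    for level in range(max_level + 1):
        m = 2 ** level
        coef = histogram_coefficients(x, m)
        # Minimum of the empirical contrast ||t||^2 - (2/n) sum_i t(x_i)
        # over the m-dimensional subspace equals -sum_j a_j**2.
        crit = -np.sum(coef ** 2) + c * m / n
        if best is None or crit < best[0]:
            best = (crit, m, coef)
    return best[1], best[2]

# Usage: simulated data with a non-uniform density on [0, 1].
rng = np.random.default_rng(0)
sample = rng.beta(2.0, 5.0, size=2000)
m_hat, coef_hat = select_dimension(sample)
# Value of the estimated density on each of the m_hat bins.
density_hat = np.sqrt(m_hat) * coef_hat
```

Because the dyadic histogram spaces are nested, increasing the level can only decrease the empirical contrast, and the penalty is what prevents the largest subspace from always being chosen.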
