Inference in hidden Markov models

This book is a comprehensive treatment of inference for hidden Markov models, covering both algorithms and statistical theory. Topics range from filtering and smoothing of the hidden Markov chain to parameter estimation, Bayesian methods, and estimation of the number of states. The book treats models with finite state spaces and models with continuous state spaces (also called state-space models) in a unified way; the latter require approximate simulation-based algorithms, which are also described in detail. Many examples illustrate the algorithms and theory. The book builds on recent developments to present a self-contained view.
