hsmm - An R package for analyzing hidden semi-Markov models

Hidden semi-Markov models generalize the well-known hidden Markov model. They allow greater flexibility in the sojourn time distributions, which are implicitly geometric in the case of a hidden Markov chain. The aim of this paper is to describe hsmm, a new software package for the statistical computing environment R. The package supports simulation and maximum likelihood estimation of hidden semi-Markov models. The implemented Expectation-Maximization (EM) algorithm assumes that the time spent in the last visited state is right-censored. It therefore avoids the common restriction that the last visited state must terminate at the last observation. Additionally, hsmm permits the user to make inferences about the underlying state sequence via the Viterbi algorithm and smoothing probabilities.
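
To make the distinction concrete, the following minimal Python sketch contrasts the geometric sojourn time implied by a Markov chain with a semi-Markov state sequence whose sojourn times are drawn from user-supplied distributions. It is an illustration only and does not use or reproduce the hsmm package: the function names `geometric_sojourn_pmf` and `simulate_semi_markov`, and their parameters, are hypothetical. Truncating the sequence at the requested length plays the role of right-censoring the last sojourn.

```python
import random


def geometric_sojourn_pmf(p_stay, d):
    """P(sojourn time = d) for a Markov chain state with self-transition
    probability p_stay: stay d - 1 times, then leave once."""
    return p_stay ** (d - 1) * (1.0 - p_stay)


def simulate_semi_markov(n, trans, sojourn_samplers, init_state=0, seed=0):
    """Simulate n steps of a semi-Markov state sequence (hypothetical helper).

    trans[i][j]         : probability of moving from state i to state j,
                          with trans[i][i] == 0 (no self-transitions).
    sojourn_samplers[i] : function taking a random.Random instance and
                          returning a positive integer sojourn time.
    """
    rng = random.Random(seed)
    n_states = len(trans)
    states = []
    state = init_state
    while len(states) < n:
        # Draw the sojourn time explicitly instead of relying on
        # repeated self-transitions.
        duration = sojourn_samplers[state](rng)
        states.extend([state] * duration)
        # Choose the next (different) state.
        state = rng.choices(range(n_states), weights=trans[state])[0]
    # Cutting at n leaves the last sojourn possibly incomplete,
    # i.e. right-censored.
    return states[:n]


if __name__ == "__main__":
    # Geometric sojourn probabilities implied by a self-transition of 0.8.
    print([round(geometric_sojourn_pmf(0.8, d), 4) for d in range(1, 6)])

    # Two states with uniform sojourn times on {2,...,5} and {1,...,3}.
    trans = [[0.0, 1.0], [1.0, 0.0]]
    samplers = [lambda r: r.randint(2, 5), lambda r: r.randint(1, 3)]
    print(simulate_semi_markov(20, trans, samplers, seed=1))
```

Replacing the samplers with geometric draws would recover an ordinary Markov chain, which is the sense in which the semi-Markov formulation is the more general one.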
