Hidden Markov Models

Hidden Markov models (HMMs) are used to categorize sequences of data, in our case DNA.[2, 1] The idea behind them is simple: an HMM is a model that generates a data sequence by following a stochastic procedure. The model contains a finite, usually small, number of states; the sequence is generated by moving from state to state and producing a piece of data at each state. In a regular (not hidden) Markov model, the data produced at each state is predetermined (for example, there is one state for each of the bases A, T, G, and C), so the history of states is given explicitly in the data. See figure 1 for a diagram of a regular Markov model. In an HMM, the history of states the model took cannot generally be determined from the data sequence, because each state emits data according to a probability distribution rather than a fixed value.

Rabiner and Durbin et al.[2, 1] use notation similar to the following. If there are N states, each state is denoted S_i, where i = 1, ..., N. The probability of moving from state S_i to state S_j is given by the matrix element a_ij. The probability of producing, or emitting, the data O_k in state S_i is e_i(O_k). In our example, O_k ∈ {A, T, G, C}. See figure 2 for a diagram of a three-state HMM. Let O = O_1 O_2 O_3 ... O_T be a sequence of T observations (data), and let Q = q_1 q_2 q_3 ... q_T be the sequence of T states the model went through to produce those observations. The values π_i are the probabilities of starting in the states S_i.
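The generating procedure and notation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three-state model and every probability in it are made up, the names sample_hmm and joint_probability are hypothetical, and states are indexed from 0 rather than 1. The second function uses the standard factorization P(O, Q) = π_q1 e_q1(O_1) a_q1q2 e_q2(O_2) ... a_q(T-1)qT e_qT(O_T).

```python
import random

ALPHABET = ["A", "T", "G", "C"]

# Hypothetical three-state model; all numbers are invented for illustration.
PI = [0.5, 0.3, 0.2]                  # pi_i: probability of starting in S_i
A = [[0.8, 0.1, 0.1],                 # a_ij: probability of moving S_i -> S_j
     [0.2, 0.6, 0.2],
     [0.3, 0.3, 0.4]]
E = [[0.40, 0.40, 0.10, 0.10],        # e_i(O_k): emission probabilities,
     [0.10, 0.10, 0.40, 0.40],        # columns ordered as in ALPHABET
     [0.25, 0.25, 0.25, 0.25]]


def sample_hmm(pi, a, e, length, rng):
    """Generate (Q, O): a state sequence and the observations it emitted."""
    states, observations = [], []
    # Choose the starting state according to pi.
    state = rng.choices(range(len(pi)), weights=pi)[0]
    for _ in range(length):
        # Emit one symbol from the current state's emission distribution.
        symbol = rng.choices(range(len(ALPHABET)), weights=e[state])[0]
        states.append(state)
        observations.append(ALPHABET[symbol])
        # Move to the next state according to row `state` of a.
        state = rng.choices(range(len(pi)), weights=a[state])[0]
    return states, observations


def joint_probability(pi, a, e, states, observations):
    """Probability of producing `observations` along the path `states`."""
    if not states:
        return 1.0
    prob = pi[states[0]] * e[states[0]][ALPHABET.index(observations[0])]
    for t in range(1, len(states)):
        prob *= a[states[t - 1]][states[t]]
        prob *= e[states[t]][ALPHABET.index(observations[t])]
    return prob


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs give the same sample
    q, o = sample_hmm(PI, A, E, 10, rng)
    print("Q =", q)
    print("O =", "".join(o))
    print("P(O, Q) =", joint_probability(PI, A, E, q, o))
```

Only O would be visible in real data. Because several states in this toy model can emit the same base, many different state sequences Q are compatible with a given O, which is what makes the model "hidden".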

[1] Yangsheng Xu, et al. Online, interactive learning of gestures for human/robot interfaces, 1996, Proceedings of IEEE International Conference on Robotics and Automation.

[2] Lawrence R. Rabiner, et al. A tutorial on hidden Markov models and selected applications in speech recognition, 1989, Proc. IEEE.

[3] Alex Pentland, et al. A Bayesian Computer Vision System for Modeling Human Interaction, 1999, ICVS.

[4] Robert D. Nowak, et al. Wavelet-based statistical signal processing using hidden Markov models, 1998, IEEE Trans. Signal Process.

[5] Mark Borodovsky, et al. GENMARK: Parallel Gene Recognition for Both DNA Strands, 1993, Comput. Chem.

[6] Sean R. Eddy, et al. Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids, 1998.

[7] Alex Pentland, et al. Graphical models for driver behavior recognition in a SmartCar, 2000, Proceedings of the IEEE Intelligent Vehicles Symposium 2000 (Cat. No.00TH8511).

[8] Alex Pentland, et al. Coupled hidden Markov models for complex action recognition, 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[9] Alex Pentland, et al. Driver behavior recognition and prediction in a SmartCar, 2000, Defense, Security, and Sensing.