An Introduction to the Hidden Markov Models for Bioinformatics

The Hidden Markov Model (HMM) is a statistical model that is well suited to many tasks in molecular biology, although HMMs have been developed mostly for speech recognition since the early 1970s. The most popular use of the HMM in molecular biology is as a "probabilistic profile" of a protein family, which is called a profile HMM. A profile HMM can be built from a family of proteins (or DNA sequences) and then used to search a database for other members of the family. The HMM can also be applied to other types of problems. It is particularly well suited to problems with a simple "grammatical structure", such as gene finding.
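
To make the idea of a simple "grammatical structure" concrete, the following is a minimal sketch of a two-state HMM that labels each base of a DNA sequence as "coding" or "noncoding" using the Viterbi algorithm. The state names, the probabilities, and the function name `viterbi` are illustrative assumptions chosen for this toy example; they are not taken from a real gene finder or from any fitted model.

```python
import math

# Toy two-state HMM. All numbers below are illustrative assumptions,
# not parameters estimated from real sequence data.
STATES = ("coding", "noncoding")
START = {"coding": 0.5, "noncoding": 0.5}
TRANS = {
    "coding": {"coding": 0.9, "noncoding": 0.1},
    "noncoding": {"coding": 0.1, "noncoding": 0.9},
}
EMIT = {
    "coding": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
    "noncoding": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
}


def viterbi(seq):
    """Return the most probable state path for seq and its log probability."""
    # score[s] is the best log probability of any path that ends in state s.
    score = {s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in STATES}
    backpointers = []
    for symbol in seq[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            # Pick the predecessor state that maximizes the path score into s.
            prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            pointers[s] = prev
            new_score[s] = (
                score[prev] + math.log(TRANS[prev][s]) + math.log(EMIT[s][symbol])
            )
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state to recover the full path.
    last = max(STATES, key=lambda s: score[s])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, score[last]


path, log_prob = viterbi("GCGCATAT")
print(path, log_prob)
```

With these assumed parameters, the GC-rich first half of the sequence is labeled "coding" and the AT-rich second half "noncoding". The transition probabilities play the role of the grammar: they make switching between regions costly, so the labeling changes state only when the emissions favor it strongly enough.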