Handling updates of a biological sequence based on Hidden Markov Models

The computational complexity of evaluating homologies between a gene sequence and profile Hidden Markov Models (HMMs) is relatively high. Unfortunately, researchers must re-evaluate matches every time they discover an error in a sequence or encounter a mutation of the sequence. Since these occurrences are frequent, it is desirable to have a low-complexity procedure for updating the matching result when a small perturbation of a given input gene sequence is observed. In this paper, we describe such a procedure based on a sensitivity analysis of the Viterbi algorithm, which is used to evaluate the similarity between an unknown gene sequence and a profile HMM. By extending single-arc tolerance bounds to the evaluation of the relative change in all nodes' distances from a root node, our algorithm skips all unperturbed parts of a sequence. As a result, our proposed algorithm can update the matching decision in only 20% of the time required by the current approach, which computes a new match from scratch using the perturbed sequence and the base HMM.
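To make the idea of skipping unperturbed parts concrete, the following is a minimal sketch and not the tolerance-bound procedure of this paper. It assumes a toy two-state HMM with finite log-probabilities and a single-symbol substitution; the names `next_column`, `viterbi_columns`, `update_after_substitution`, and all model parameters are hypothetical. It reuses the Viterbi columns before the perturbed position, recomputes forward from there, and stops as soon as the new column differs from the stored one by the same constant in every state, since from that point on all later columns shift by that constant.

```python
import math

def next_column(prev, symbol, states, log_start, log_trans, log_emit):
    """One Viterbi step in log space; prev is None for the first position."""
    col = {}
    for s in states:
        if prev is None:
            best = log_start[s]
        else:
            best = max(prev[p] + log_trans[p][s] for p in states)
        col[s] = best + log_emit[s][symbol]
    return col

def viterbi_columns(obs, model):
    """Full recomputation: one score column per sequence position."""
    cols = []
    for symbol in obs:
        cols.append(next_column(cols[-1] if cols else None, symbol, *model))
    return cols

def update_after_substitution(cols, obs, pos, new_symbol, model, tol=1e-9):
    """Update stored columns after obs[pos] is replaced by new_symbol.

    Assumes all log-probabilities are finite. Returns the updated columns
    and the number of columns that were actually recomputed.
    """
    states = model[0]
    new_obs = obs[:pos] + [new_symbol] + obs[pos + 1:]
    new_cols = list(cols[:pos])  # the prefix is unaffected by the change
    recomputed = 0
    for t in range(pos, len(new_obs)):
        prev = new_cols[t - 1] if t > 0 else None
        col = next_column(prev, new_obs[t], *model)
        new_cols.append(col)
        recomputed += 1
        offsets = [col[s] - cols[t][s] for s in states]
        if max(offsets) - min(offsets) <= tol:
            # Same shift in every state: later columns shift by it too.
            shift = offsets[0]
            new_cols.extend({s: c[s] + shift for s in states}
                            for c in cols[t + 1:])
            break
    return new_cols, recomputed

def logs(probs):
    return {k: math.log(v) for k, v in probs.items()}

# Hypothetical toy model: a match-like state "M" and an insert-like state "I".
states = ("M", "I")
model = (
    states,
    logs({"M": 0.9, "I": 0.1}),
    {"M": logs({"M": 0.9, "I": 0.1}), "I": logs({"M": 0.5, "I": 0.5})},
    {"M": logs({"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1}),
     "I": logs({"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25})},
)

obs = list("ACGTTACGGA")
cols = viterbi_columns(obs, model)
new_cols, recomputed = update_after_substitution(cols, obs, 3, "A", model)
new_score = max(new_cols[-1].values())  # best log-score for the perturbed sequence
```

In this sketch the saving depends on how quickly the shifted columns re-align after the perturbed position; in the worst case every column from that position onward is recomputed, which is still no more work than a full run.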