Evaluating the homology between a gene sequence and a profile hidden Markov model (HMM) is computationally expensive. Unfortunately, researchers must re-evaluate a match every time they discover an error in a sequence or encounter a mutation of it. Because such events are frequent, a low-complexity procedure for updating the matching result after a small perturbation of the input gene sequence is desirable. In this paper, we describe such a procedure, based on a sensitivity analysis of the Viterbi algorithm used to evaluate the similarity between an unknown gene sequence and a profile HMM. By extending single-arc tolerance bounds to the relative change in every node's distance from a root node, our algorithm skips all unperturbed parts of a sequence. As a result, the proposed algorithm updates the matching decision in only 20% of the time required by the current approach, which computes a new match between the perturbed sequence and the base HMM.
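The idea of reusing unperturbed work can be illustrated with a minimal sketch. It is not the arc-tolerance procedure described above but a simpler stand-in: after a single substitution, the Viterbi columns before the changed position are reused as they are, and recomputation stops as soon as a new column differs from the old one by the same constant in every state, because all later columns then shift by that constant too. The two-state model, its probabilities, and every function name below are hypothetical and chosen only for illustration.

```python
import math

# Toy two-state model (assumed values, not taken from the paper).
STATES = ("M", "I")
LOG_START = {"M": math.log(0.9), "I": math.log(0.1)}
LOG_TRANS = {
    "M": {"M": math.log(0.8), "I": math.log(0.2)},
    "I": {"M": math.log(0.4), "I": math.log(0.6)},
}
LOG_EMIT = {
    "M": {"A": math.log(0.7), "C": math.log(0.1), "G": math.log(0.1), "T": math.log(0.1)},
    "I": {c: math.log(0.25) for c in "ACGT"},
}


def _column(prev, symbol):
    """One Viterbi column: best log-score of any path ending in each state."""
    col = {}
    for s in STATES:
        if prev is None:
            best = LOG_START[s]
        else:
            best = max(prev[r] + LOG_TRANS[r][s] for r in STATES)
        col[s] = best + LOG_EMIT[s][symbol]
    return col


def viterbi_columns(obs):
    """Full evaluation: compute every column from scratch."""
    cols = []
    for symbol in obs:
        cols.append(_column(cols[-1] if cols else None, symbol))
    return cols


def update_after_substitution(old_cols, new_obs, pos):
    """Update the columns after the symbol at index pos has been substituted.

    Returns the new columns and the number of columns that needed the full
    max-over-predecessors recomputation.
    """
    cols = list(old_cols[:pos])  # the prefix is unaffected by the change
    recomputed = 0
    for t in range(pos, len(new_obs)):
        col = _column(cols[-1] if cols else None, new_obs[t])
        cols.append(col)
        recomputed += 1
        shifts = [col[s] - old_cols[t][s] for s in STATES]
        if all(math.isclose(d, shifts[0], abs_tol=1e-12) for d in shifts):
            # Uniform shift: every later column equals the old one plus the
            # same constant, so the transition maximisation can be skipped.
            for u in range(t + 1, len(new_obs)):
                cols.append({s: old_cols[u][s] + shifts[0] for s in STATES})
            break
    return cols, recomputed


old = "ACGTACGTAA"
new = old[:3] + "A" + old[4:]  # substitute the symbol at index 3
new_cols, recomputed = update_after_substitution(viterbi_columns(old), new, 3)
best_log_score = max(new_cols[-1].values())
```

The sketch handles substitutions only; an insertion or deletion changes the sequence length, so the old and new columns no longer line up and the uniform-shift test would have to be adapted.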