An Algorithm for the Distance Between Two Finite Sequences

In a biological problem, which will be described later, it is necessary to compute the distance or degree of dissimilarity between two finite sequences. A mathematical definition of this distance was brought to my attention by S. M. Ulam, and an algorithm for computing it will be presented here. If m and n are the lengths of the two sequences and m < n, then the number of computational steps in the algorithm is mn, where each step consists of selecting the largest of three known numbers. In Section 2 it will be shown how the algorithm can be changed to compute the modifications of this distance which are required in the biological context.
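
The step count can be illustrated with a minimal sketch. It is not the algorithm of this paper, whose distance is defined only in the later sections. It assumes the familiar dynamic-programming table in which each cell is the largest of three candidates, and it assumes a score of 1 for matching elements and 0 otherwise. The names match_score and indel_distance are hypothetical, and the conversion from score to distance (counting unmatched elements) is likewise an assumption made for illustration.

```python
def match_score(a, b):
    """Return (score, steps) for two sequences a and b.

    Assumed scoring: 1 for a pair of equal elements, 0 otherwise.
    Each inner iteration is one "step": the largest of three numbers.
    """
    m, n = len(a), len(b)
    # table[i][j] holds the best score for the prefixes a[:i] and b[:j].
    table = [[0] * (n + 1) for _ in range(m + 1)]
    steps = 0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            diagonal = table[i - 1][j - 1] + (1 if a[i - 1] == b[j - 1] else 0)
            # One step: select the largest of three known numbers.
            table[i][j] = max(diagonal, table[i - 1][j], table[i][j - 1])
            steps += 1
    return table[m][n], steps


def indel_distance(a, b):
    """Assumed distance: number of elements left unmatched in a and b."""
    score, _ = match_score(a, b)
    return len(a) + len(b) - 2 * score
```

With sequences of lengths m = 3 and n = 4, the inner loop runs 3 x 4 = 12 times, which agrees with the mn step count stated above.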
