Gibbs, A. J., Dale, M. B., Kinns, H. R., and MacKenzie, H. G. (Australian National University and Commonwealth Scientific and Industrial Research Organization, Canberra, Australia) 1971. The transition matrix method for comparing sequences; its use in describing and classifying proteins by their amino acid sequences. Syst. Zool., 20:417-425. Two hundred and sixteen proteins were classified by comparing the frequency with which different amino acid doublets (nearest neighbour pairs) occurred in their amino acid sequences. The proteins were classified into the usually recognised groups (e.g., the fibrinopeptides A & B, the insulins, the haemoglobins, etc.). The relationships within these groups mostly coincided with the assumed relationships of the organisms from which the proteins were obtained. The classification also suggested relationships between some proteins not normally thought to be related. [Numerical phenetics, protein primary structure comparisons]
Reports of the amino acid sequences of different proteins are increasing in number. Comparisons of the sequences are of interest not only to biochemists but also to taxonomists, for it has been found that the relatedness of the amino acid sequences of homologous proteins from different organisms is usually correlated with the relatedness (estimated in conventional ways) of the organisms from which they were obtained. So far most comparisons have been made between proteins that are likely to be related, such as the haemoglobins. However, similarity of sequence (sequence homology) has also been shown between proteins that are not obviously related, such as hens' egg lysozyme and bovine alpha-lactalbumin (Brew, Vanaman and Hill, 1967). Thus it would perhaps be worthwhile to compare the sequences of all known proteins, and this can only be done conveniently with the help of a computer. Various methods have been used for comparing and classifying sequences.
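As a rough illustration of the doublet (nearest neighbour pair) frequencies mentioned in the abstract, the following minimal sketch counts the pairs in a sequence written in one-letter amino acid codes. The function names and the distance used here (a sum of absolute frequency differences) are assumptions made for illustration only; the text above does not state which measure the authors used.

```python
from collections import Counter


def doublet_frequencies(sequence):
    """Relative frequency of each nearest-neighbour pair in a sequence.

    `sequence` is a string of one-letter residue codes (an assumed input format).
    """
    # Pair each residue with its immediate successor and count the pairs.
    pairs = Counter(zip(sequence, sequence[1:]))
    total = sum(pairs.values())
    if total == 0:
        return {}
    return {pair: count / total for pair, count in pairs.items()}


def doublet_distance(seq_a, seq_b):
    """Illustrative dissimilarity between two sequences (hypothetical measure).

    Sums the absolute differences between doublet frequencies over every
    doublet seen in either sequence. No alignment is needed.
    """
    freq_a = doublet_frequencies(seq_a)
    freq_b = doublet_frequencies(seq_b)
    doublets = set(freq_a) | set(freq_b)
    return sum(abs(freq_a.get(d, 0.0) - freq_b.get(d, 0.0)) for d in doublets)
```

Because only pair counts are compared, sequences of different lengths can be compared directly, without any attempt to align them.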
Perhaps the simplest method is to classify them by their composition, for if two sequences are identical they must have the same composition. Different sequences may have the same composition, but sequences of different composition cannot possibly be identical. Another method is to align each pair of sequences, allowing gaps if necessary, and then to count the number of differences between the sequences and use this as a measure of similarity for classification. The great disadvantage of this method is that comparisons are only possible between proteins that are fairly easily aligned. At least three methods may be used to overcome the difficulty of aligning sequences. The sequences may be compared by a sliding match method (Sackin, Sneath and Merriam, 1966) in which one sequence (or a part of it) is 'moved' along the other so that each amino acid in one sequence comes, in turn, alongside each amino acid in the other sequence, and, after each move, the number of matched amino acids in the two sequences is recorded. Alternatively, they may be compared by the 'diagram' method of Gibbs and MacIntyre (1970), which is, in essence, a simple way of making a complete 'sliding match' comparison of two sequences. Another alternative is to compare the sequences by the frequency with which runs of different amino acids
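The sliding match comparison described above can be sketched as follows. This is a minimal illustration, assuming sequences given as strings of one-letter residue codes; the function name and the convention for numbering offsets are hypothetical and are not taken from Sackin, Sneath and Merriam (1966).

```python
def sliding_match(seq_a, seq_b):
    """Count matched residues at every relative position of seq_b along seq_a.

    Returns a dict mapping offset -> number of matches, where `offset` is the
    position in seq_a that lies alongside the first residue of seq_b
    (negative offsets mean seq_b overhangs the start of seq_a).
    """
    scores = {}
    # Slide seq_b from a one-residue overlap at the start of seq_a
    # to a one-residue overlap at its end.
    for offset in range(-(len(seq_b) - 1), len(seq_a)):
        matches = 0
        for j, residue in enumerate(seq_b):
            i = offset + j
            # Only count positions where the two sequences overlap.
            if 0 <= i < len(seq_a) and seq_a[i] == residue:
                matches += 1
        scores[offset] = matches
    return scores
```

For example, `sliding_match("ABCD", "BC")` records two matches at offset 1 and none elsewhere. Each offset corresponds to one diagonal of a residue-by-residue grid of matches, which is why a grid of this kind gives a complete sliding match comparison at a glance.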
[1] S. C. Rall, et al. The amino acid sequence of ferredoxin from Clostridium acidi-urici. 1969.
[2] E. Margoliash, et al. Comparative aspects of primary structures of proteins. Annual Review of Biochemistry, 1968.
[3] G. N. Lance, et al. A General Theory of Classificatory Sorting Strategies: 1. Hierarchical Systems. Comput. J., 1967.
[4] A. Gibbs, et al. The Diagram, a Method for Comparing Sequences. 1970.
[5] K. Sletten, et al. Cytochrome c2 of Rhodospirillum rubrum. 1. Molecular properties of the protein and amino acid sequences of its peptides derived by the action of trypsin and thermolysin. The Journal of Biological Chemistry, 1968.
[6] G. N. Lance, et al. Numerical Classification of Sequences. Aust. Comput. J., 1970.
[7] E. Margoliash, et al. Primary structure of alfalfa ferredoxin. The Journal of Biological Chemistry, 1969.
[8] R. J. Meadway, et al. Chemical Structure of Bacterial Penicillinases. Nature, 1969.
[9] C. A. Leone. Chemotaxonomy and Serotaxonomy. 1969.
[10] K. Yasunobu, et al. Non-heme iron proteins. X. The amino acid sequences of ferredoxins from Leucaena glauca. The Journal of Biological Chemistry, 1969.
[11] J. Clegg, et al. Coincidence and protein structure. Journal of Molecular Biology, 1961.
[12] W. Fitch. An improved method of testing for evolutionary homology. Journal of Molecular Biology, 1966.