On the Characterization of DNA Primary Sequences by Triplet of Nucleic Acid Bases

We consider the construction of a set of small 4 x 4 matrices to represent DNA primary sequences, based on the enumeration of all 64 triplets of nucleic acid bases. The leading eigenvalue of each constructed matrix is selected as an invariant and used to build a vector that characterizes the DNA sequence. Additional invariants derived from the condensed matrices include a 64-component vector whose components correspond to the ordered triplets XYZ, with X, Y, Z = A, C, G, T. Similarity/dissimilarity tables constructed from the different invariants for the first exon of the beta-globin gene of eight species illustrate the utility of the newly formulated invariants for DNA.
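
As a rough illustration of the triplet enumeration described above, the following Python sketch counts all 64 ordered triplets XYZ in a sequence, arranges the counts into four 4 x 4 matrices (one per first base X, with rows indexed by Y and columns by Z), and compares two sequences by the Euclidean distance between their 64-component vectors. Several points are assumptions made only for this sketch and are not taken from the abstract: triplets are read with an overlapping sliding window, the components are raw counts, the matrices are grouped by the first base, and Euclidean distance is used as the dissimilarity measure. The function names are hypothetical. The leading eigenvalue of each matrix is not computed here; a numerical linear algebra library could be applied to the matrices returned by `condensed_matrices`.

```python
from itertools import product
from math import sqrt

BASES = "ACGT"
# All 64 ordered triplets XYZ in lexicographic order: AAA, AAC, ..., TTT.
TRIPLETS = ["".join(p) for p in product(BASES, repeat=3)]


def triplet_vector(seq):
    """Return the 64-component vector of triplet counts for seq.

    Assumption: triplets are read with an overlapping window of width 3.
    Windows containing characters other than A, C, G, T are skipped.
    """
    counts = dict.fromkeys(TRIPLETS, 0)
    seq = seq.upper()
    for i in range(len(seq) - 2):
        triplet = seq[i:i + 3]
        if triplet in counts:
            counts[triplet] += 1
    return [counts[t] for t in TRIPLETS]


def condensed_matrices(vec):
    """Arrange a 64-component vector into four 4 x 4 matrices.

    Assumption: one matrix per first base X; rows follow the second
    base Y and columns the third base Z, both in the order A, C, G, T.
    """
    return {
        x: [vec[16 * i + 4 * j: 16 * i + 4 * j + 4] for j in range(4)]
        for i, x in enumerate(BASES)
    }


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


if __name__ == "__main__":
    # Short made-up sequences, not real beta-globin data.
    s1 = "ATGGTGCACCTGACT"
    s2 = "ATGGTGCATCTGTCC"
    v1, v2 = triplet_vector(s1), triplet_vector(s2)
    print(condensed_matrices(v1)["A"])
    print(euclidean(v1, v2))
```

Computing such a distance for every pair of sequences would yield one entry per pair in a similarity/dissimilarity table of the kind mentioned in the abstract.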