On the Similarity of DNA Primary Sequences

We consider the numerical characterization of graphical representations of DNA primary sequences. In particular, we consider graphical representations of the DNA of beta-globins of several species, including human, based on the approach of A. Nandy, in which the nucleic bases are associated with a walk over the integral points of a Cartesian (x, y) coordinate system. With each graphical representation so generated we associate a distance/distance matrix, whose elements are the quotients of the Euclidean and the graph-theoretical distances, that is, of the through-space and through-bond distances, for pairs of bases in the graphical representation. We use the eigenvalues of the matrices so constructed to characterize individual DNA sequences. The eigenvalues are used to construct numerical sequences, which are subsequently used for similarity/dissimilarity analysis. The results of this analysis have been compared and combined with similarity tables based on the frequency of occurrence of pairs of bases.
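The construction described above can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions that the abstract does not state: the assignment of bases to axis directions in `STEPS`, the use of the separation along the sequence, |i - j|, as the through-bond distance, and the use of a shifted power iteration to approximate the leading eigenvalue. The function names `walk`, `dd_matrix`, and `leading_eigenvalue` are hypothetical.

```python
import math

# Assumed assignment of bases to unit steps on the integer lattice.
# The abstract does not give the assignment; check it against Nandy's paper.
STEPS = {"G": (1, 0), "A": (-1, 0), "C": (0, 1), "T": (0, -1)}


def walk(seq):
    """Return the lattice point reached after each base, starting from the origin."""
    x = y = 0
    points = []
    for base in seq.upper():
        dx, dy = STEPS[base]
        x, y = x + dx, y + dy
        points.append((x, y))
    return points


def dd_matrix(points):
    """Build the distance/distance matrix: Euclidean distance over path distance.

    Assumption: the through-bond distance between bases i and j is their
    separation along the sequence, |i - j|. The diagonal is set to zero.
    """
    n = len(points)
    m = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dx = points[i][0] - points[j][0]
            dy = points[i][1] - points[j][1]
            m[i][j] = m[j][i] = math.hypot(dx, dy) / (j - i)
    return m


def leading_eigenvalue(m, iterations=1000, shift=1.0):
    """Approximate the largest eigenvalue of a symmetric non-negative matrix.

    Power iteration is run on m + shift * I so that an eigenvalue of equal
    magnitude but opposite sign cannot cause oscillation; the Rayleigh
    quotient is then evaluated on the unshifted matrix.
    """
    n = len(m)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        w = [sum(m[i][j] * v[j] for j in range(n)) + shift * v[i] for i in range(n)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    mv = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(v[i] * mv[i] for i in range(n))


if __name__ == "__main__":
    points = walk("ATGGTGCACC")  # short illustrative fragment, not real analysis data
    print(leading_eigenvalue(dd_matrix(points)))
```

Under these assumptions a walk that never turns gives a matrix with every off-diagonal entry equal to 1, so a more folded walk yields a smaller leading eigenvalue. Leading eigenvalues collected in this way for several sequences could then be compared pairwise, which is the kind of similarity/dissimilarity analysis the abstract refers to.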
