Investigation of coding structure in DNA

The phrase "cracking the genomic code" is familiar, but is DNA a code in the information-theoretic sense? The established term "genetic code" refers to the mapping from nucleotide triplets (codons) to amino acids. That is a code only in the computer-programming sense: each codon acts as an instruction that is executed to produce an amino acid sequence. We examine methods for detecting redundant coding structure in DNA. First, we present a finite-field framework for representing a nucleotide sequence as a symbolic sequence; we then examine approaches to finding the sequence structure associated with error-correcting codes. We compare a previously proposed parity-check vector search method with a novel subspace partitioning algorithm, which is a general approach to finding any linear coding redundancy. Our method provides a simple way of visualizing coding potential in DNA sequences, as shown on test data.
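To make the finite-field framing and the idea of a parity-check vector search concrete, the following is a minimal sketch and not the procedure evaluated in this work. It assumes an arbitrary labeling of the four nucleotides as elements of GF(4), cuts the sequence into non-overlapping blocks of length n, and scores every nonzero candidate vector h by the fraction of blocks b for which h·b = 0 over GF(4). The names `NUC_TO_GF4`, `gf4_dot`, `to_blocks`, and `parity_check_scores` are hypothetical and chosen only for illustration.

```python
from itertools import product

# Assumed labeling of nucleotides as GF(4) elements {0, 1, a, a+1} -> {0, 1, 2, 3}.
# The choice of labeling is an illustrative assumption, not taken from the source.
NUC_TO_GF4 = {"A": 0, "C": 1, "G": 2, "T": 3}

# Multiplication table of GF(4) with a^2 = a + 1; addition in GF(4) is bitwise XOR.
GF4_MUL = [
    [0, 0, 0, 0],
    [0, 1, 2, 3],
    [0, 2, 3, 1],
    [0, 3, 1, 2],
]


def gf4_dot(h, block):
    """Inner product of two equal-length vectors over GF(4)."""
    acc = 0
    for hi, bi in zip(h, block):
        acc ^= GF4_MUL[hi][bi]
    return acc


def to_blocks(seq, n):
    """Map a nucleotide string to GF(4) symbols and cut it into length-n blocks."""
    symbols = [NUC_TO_GF4[c] for c in seq.upper()]
    return [symbols[i:i + n] for i in range(0, len(symbols) - n + 1, n)]


def parity_check_scores(seq, n):
    """Score each nonzero h in GF(4)^n by the fraction of blocks with h.b == 0."""
    blocks = to_blocks(seq, n)
    if not blocks:
        raise ValueError("sequence is shorter than one block")
    scores = {}
    for h in product(range(4), repeat=n):
        if not any(h):
            continue  # the all-zero vector satisfies every block trivially
        satisfied = sum(1 for b in blocks if gf4_dot(h, b) == 0)
        scores[h] = satisfied / len(blocks)
    return scores


# Toy input: every block (x1, x2, x3) was constructed so that x1 + x2 + x3 = 0 in GF(4).
toy_scores = parity_check_scores("ACCCGTGGATAT", 3)
```

On this toy input the vector (1, 1, 1) and its two nonzero scalar multiples are the only candidates satisfied by every block. The exhaustive search costs 4^n - 1 candidates per block length, so it is practical only for small n; a sequence with no linear redundancy would be expected to show no candidate scoring near 1.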
