Systematics of Short-range Correlations in Eukaryotic Genomes

Attempts to identify a species from its DNA sequence on purely statistical grounds have been made for more than a decade. Solving this problem could substantially advance the understanding of genome evolution and inform the design of classification schemes for DNA sequences.
