REPuter: the manifold applications of repeat analysis on a genomic scale.

The repetitive structure of genomic DNA holds many secrets to be discovered. A systematic study of repetitive DNA on a genomic or inter-genomic scale requires extensive algorithmic support. The REPuter program described herein was designed to serve as a fundamental tool in such studies. Efficient and complete detection of various types of repeats is provided together with an evaluation of significance and interactive visualization. This article circumscribes the wide scope of repeat analysis using applications in five different areas of sequence analysis: checking fragment assemblies, searching for low copy repeats, finding unique sequences, comparing gene structures and mapping of cDNA/EST sequences.
