Motif Discovery in Heterogeneous Sequence Data

This paper introduces the first integrated algorithm designed to discover novel motifs in heterogeneous sequence data, that is, data consisting of coregulated genes from a single genome together with the orthologs of those genes from other genomes. Results are presented for regulons in yeasts, worms, and mammals.
