Systematic discovery of regulatory motifs in human promoters and 3′ UTRs by comparison of several mammals

Comprehensive identification of all functional elements encoded in the human genome is a fundamental need in biomedical research. Here, we present a comparative analysis of the human, mouse, rat and dog genomes to create a systematic catalogue of common regulatory motifs in promoters and 3′ untranslated regions (3′ UTRs). The promoter analysis yields 174 candidate motifs, including most previously known transcription-factor binding sites and 105 new motifs. The 3′-UTR analysis yields 106 motifs likely to be involved in post-transcriptional regulation. Nearly one-half are associated with microRNAs (miRNAs), leading to the discovery of many new miRNA genes and their likely target genes. Our results suggest that previous estimates of the number of human miRNA genes were low, and that miRNAs regulate at least 20% of human genes. The overall results provide a systematic view of gene regulation in the human, which will be refined as additional mammalian genomes become available.
