SwissRegulon, a database of genome-wide annotations of regulatory sites: recent updates

Identification of genomic regulatory elements is essential for understanding the dynamics of cellular processes. This task has been substantially facilitated by the availability of genome sequences for many species and high-throughput data on transcripts and transcription factor (TF) binding. However, rigorous computational methods are necessary to derive accurate genome-wide annotations of regulatory sites from such data. SwissRegulon (http://swissregulon.unibas.ch) is a database containing genome-wide annotations of regulatory motifs, promoters and TF binding sites (TFBSs) in promoter regions across model organisms. Its binding site predictions were obtained with rigorous Bayesian probabilistic methods that operate on orthologous regions from related genomes and use explicit evolutionary models to assess the evidence of purifying selection on each site. New in the current version of SwissRegulon is a curated collection of 190 mammalian regulatory motifs associated with ∼340 TFs, and TFBS annotations across a curated set of ∼35 000 promoters in both human and mouse. Predictions of TFBSs for Saccharomyces cerevisiae have also been significantly extended and now cover 158 of yeast’s ∼180 TFs. All data are accessible both through an easily navigable genome browser with search functions and as flat files that can be downloaded for further analysis.
