CollecTF: a database of experimentally validated transcription factor-binding sites in Bacteria

The influx of high-throughput data and the need for complex models to describe the interaction of prokaryotic transcription factors (TF) with their target sites pose new challenges for TF-binding site databases. CollecTF (http://collectf.umbc.edu) compiles data on experimentally validated, naturally occurring TF-binding sites across the Bacteria domain, placing a strong emphasis on the transparency of the curation process, the quality and availability of the stored data and fully customizable access to its records. CollecTF integrates multiple sources of data automatically and openly, allowing users to dynamically redefine binding motifs and their experimental support base. Data quality and currency are fostered in CollecTF by adopting a sustainable model that encourages direct author submissions in combination with in-house validation and curation of published literature. CollecTF entries are periodically submitted to NCBI for integration into RefSeq complete genome records as link-out features, maximizing the visibility of the data and enriching the annotation of RefSeq files with regulatory information. Seeking to facilitate comparative genomics and machine-learning analyses of regulatory interactions, in its initial release CollecTF provides domain-wide coverage of two TF families (LexA and Fur), as well as extensive representation for a clinically important bacterial family, the Vibrionaceae.
