ABS: a database of Annotated regulatory Binding Sites from orthologous promoters

Information about the genomic coordinates and sequences of experimentally identified transcription factor binding sites is scattered across a variety of formats. Standard collections of such high-quality data are important for designing, evaluating and improving novel computational approaches to identify binding motifs in promoter sequences from related genes. ABS is a public database of known binding sites identified in promoters of orthologous vertebrate genes, manually curated from the literature. We have annotated 650 experimental binding sites from 68 transcription factors and 100 orthologous target genes in human, mouse, rat or chicken genome sequences. Computational predictions and promoter alignment information are also provided for each entry. A simple web interface facilitates data retrieval and offers different views of the information. In addition, release 1.0 of ABS includes a customizable generator of artificial datasets based on the known sites contained in the collection, and an evaluation tool to aid in the training and assessment of motif-finding programs.
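As an illustration of how a collection of annotated sites can be used to assess a motif-finding program, the following minimal sketch compares predicted sites with annotated sites on a single promoter at the nucleotide level. The abstract does not specify which measures the ABS evaluation tool computes, so the choice of sensitivity and positive predictive value, the 1-based inclusive coordinate convention, and all function and variable names here are assumptions made for illustration only.

```python
from typing import Iterable, Set, Tuple

# Hypothetical representation: (start, end), 1-based inclusive, on one promoter.
Site = Tuple[int, int]


def covered_positions(sites: Iterable[Site]) -> Set[int]:
    """Return the set of nucleotide positions covered by at least one site."""
    positions: Set[int] = set()
    for start, end in sites:
        positions.update(range(start, end + 1))
    return positions


def nucleotide_level_scores(
    annotated: Iterable[Site], predicted: Iterable[Site]
) -> Tuple[float, float]:
    """Return (sensitivity, positive predictive value) at the nucleotide level.

    Sensitivity is the fraction of annotated positions that were predicted;
    positive predictive value is the fraction of predicted positions that
    are annotated. Both are 0.0 when their denominator is empty.
    """
    known = covered_positions(annotated)
    called = covered_positions(predicted)
    true_positives = len(known & called)
    sensitivity = true_positives / len(known) if known else 0.0
    ppv = true_positives / len(called) if called else 0.0
    return sensitivity, ppv


# Made-up coordinates, not taken from the database.
annotated_sites = [(10, 19), (40, 47)]
predicted_sites = [(12, 21), (60, 65)]

sensitivity, ppv = nucleotide_level_scores(annotated_sites, predicted_sites)
print(f"sensitivity={sensitivity:.2f} ppv={ppv:.2f}")
```

Working with sets of positions keeps overlapping sites from being counted twice; a site-level assessment would instead need an explicit rule for how much overlap counts as a hit.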
