Gibbs Recursive Sampler: finding transcription factor binding sites

The Gibbs Motif Sampler is a software package for locating common elements in collections of biopolymer sequences. In this paper we describe a new variation of the Gibbs Motif Sampler, the Gibbs Recursive Sampler, which has been developed specifically for locating multiple binding sites for multiple transcription factors simultaneously in unaligned DNA sequences that may be heterogeneous in composition. We describe the basic operation of the web-based version of this sampler. The sampler may be accessed at http://bayesweb.wadsworth.org/gibbs/gibbs.html and at http://www.bioinfo.rpi.edu/applications/bayesian/gibbs/gibbs.html. An online user guide is available at http://bayesweb.wadsworth.org/gibbs/bernoulli.html and at http://www.bioinfo.rpi.edu/applications/bayesian/gibbs/manual/bernoulli.html. Solaris, Solaris.x86 and Linux versions of the sampler are available as stand-alone programs for academic and not-for-profit users. Commercial licenses are also available. The Gibbs Recursive Sampler is distributed in accordance with the ISCB level 0 guidelines, with the requirement that its use be cited in scientific publications.
