P-Match: transcription factor binding site search by combining patterns and weight matrices

P-Match is a new tool for identifying transcription factor (TF) binding sites in DNA sequences. It combines pattern matching and weight matrix approaches, thereby providing higher recognition accuracy than either method alone. P-Match is closely interconnected with the TRANSFAC® database. In particular, P-Match uses the matrix library as well as the sets of aligned known TF binding sites collected in TRANSFAC®, and can therefore search for a large variety of different TF binding sites. Using the results of extensive tests of recognition accuracy, we selected three sets of optimized cut-off values that minimize either false negatives, false positives, or the sum of both errors. Comparison with weight matrix approaches such as the Match™ tool shows that P-Match generally provides superior recognition accuracy in the range of low false negative errors (high sensitivity). As in Match™, P-Match allows users to save their own profiles, which include selected subsets of matrices with the corresponding TF binding sites or user-defined cut-off values. Furthermore, a number of tissue-specific profiles compiled by the TRANSFAC® team are provided. A public version of the P-Match tool is available at .
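To make the idea of an optimized cut-off concrete, the following is a minimal sketch of how a cut-off minimizing the sum of both error rates could be chosen from scored examples. It is not the P-Match procedure itself: the function names, the toy scores, and the convention that a site is reported when its score is at or above the cut-off are all assumptions made for illustration.

```python
# Illustrative sketch only: names, scores, and the ">= cutoff" convention
# are assumptions, not taken from P-Match or TRANSFAC.

def error_rates(pos_scores, neg_scores, cutoff):
    """Return (false negative rate, false positive rate) at a given cut-off.

    pos_scores: scores of known binding sites (should be reported).
    neg_scores: scores of background sequences (should not be reported).
    """
    fn = sum(s < cutoff for s in pos_scores) / len(pos_scores)
    fp = sum(s >= cutoff for s in neg_scores) / len(neg_scores)
    return fn, fp


def min_sum_cutoff(pos_scores, neg_scores):
    """Pick the observed score that minimizes FN rate + FP rate."""
    candidates = sorted(set(pos_scores) | set(neg_scores))
    return min(
        candidates,
        key=lambda c: sum(error_rates(pos_scores, neg_scores, c)),
    )


# Hypothetical scores for known sites and for background sequences.
pos = [0.95, 0.90, 0.85, 0.70]
neg = [0.80, 0.60, 0.50, 0.40, 0.30]

best = min_sum_cutoff(pos, neg)
print(best, error_rates(pos, neg, best))
```

Lowering the cut-off below the chosen value would trade additional false positives for fewer false negatives, and raising it would do the reverse, which is the trade-off the three sets of cut-off values address.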
