A protein–protein interaction guided method for competitive transcription factor binding improves target predictions

An important milestone in understanding how cells function is a comprehensive picture of transcriptional regulation. Transcription is largely regulated by transcription factors (TFs) binding to DNA sites. Several TF binding site (TFBS) prediction methods have been developed, but most model the binding of a single TF at a time, although a few methods for predicting the binding of multiple TFs also exist. In this article, we propose a probabilistic model that predicts the binding of several TFs simultaneously. Our method explicitly models competitive binding between TFs and uses prior knowledge of existing protein–protein interactions (PPIs), which mimics the situation in the nucleus. Modeling DNA binding for multiple TFs markedly improves the accuracy of binding site prediction compared with other programs and with approaches that combine the individual binding predictions of separate TFs. Traditional TFBS prediction methods usually predict an overwhelming number of false positives. This lack of specificity is largely overcome by our competitive binding prediction method. In addition, binding sites that were previously unpredictable can be detected with the help of PPIs. Source code is available at http://www.cs.tut.fi/~harrila/.
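The idea of competitive binding can be illustrated with a toy calculation. The sketch below is not the model proposed in the article; it is a minimal illustration, assuming that each TF is described by a position weight matrix (PWM), that all motifs have the same length, and that exactly one TF (or none) occupies a given window. The names `competitive_posterior`, `TF_A`, `TF_B`, and all numeric values are hypothetical, and the raised prior for `TF_B` only stands in for PPI-derived evidence.

```python
import math

# Assumption: uniform background nucleotide distribution.
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}


def motif_likelihood(pwm, window):
    """Probability of the window under a PWM given as a list of per-position dicts."""
    return math.prod(column[base] for column, base in zip(pwm, window))


def background_likelihood(window):
    """Probability of the window under the background distribution."""
    return math.prod(BACKGROUND[base] for base in window)


def competitive_posterior(window, pwms, priors):
    """Posterior over which TF (or 'none') occupies the window.

    The TFs compete because the posteriors are normalized jointly:
    raising the weight of one TF lowers the share of all others.
    """
    weights = {"none": priors["none"] * background_likelihood(window)}
    for name, pwm in pwms.items():
        weights[name] = priors[name] * motif_likelihood(pwm, window)
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


def make_pwm(consensus, strong=0.7, weak=0.1):
    """Hypothetical helper: build a toy PWM that favors a consensus string."""
    return [{b: (strong if b == c else weak) for b in "ACGT"} for c in consensus]


# Hypothetical motifs: TF_A prefers ACG, TF_B prefers ACT.
pwms = {"TF_A": make_pwm("ACG"), "TF_B": make_pwm("ACT")}

# Equal priors for both TFs (made-up values).
priors = {"none": 0.90, "TF_A": 0.05, "TF_B": 0.05}
post = competitive_posterior("ACG", pwms, priors)

# Assumption: a known interaction partner bound nearby raises the prior of TF_B.
ppi_priors = {"none": 0.85, "TF_A": 0.05, "TF_B": 0.10}
boosted = competitive_posterior("ACG", pwms, ppi_priors)

print(post)
print(boosted)
```

In this toy setting the window `ACG` favors `TF_A`, and raising the prior of `TF_B` increases its posterior share at the expense of the alternatives, which is the qualitative effect that a PPI-informed prior is meant to have.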
