Identifying transcription factor complexes and their roles

Motivation: Eukaryotic gene expression is controlled through molecular logic circuits that combine the regulatory signals of many different factors. In particular, complexation of transcription factors (TFs) and other regulatory proteins is a prevalent and highly conserved mechanism of signal integration within critical regulatory pathways; knowing these complexes enables us to infer the genes they control as well as the regulatory mechanism exerted. Common approaches to protein complex prediction that use only protein interaction networks, however, are designed to detect self-contained functional complexes and have difficulty revealing dynamic combinatorial assemblies of physically interacting proteins.

Results: We developed the novel algorithm DACO, which combines protein–protein interaction networks and domain–domain interaction networks with the cluster-quality metric cohesiveness. The metric is locally maximized at the level of protein interactions, while connectivity constraints at the domain level account for the exclusive and thus inherently combinatorial nature of the interactions within such assemblies. When applied to predicting TF complexes in the yeast Saccharomyces cerevisiae, the proposed approach outperformed popular complex prediction methods by a wide margin. Furthermore, we were able to assign many of the predictions to target genes, as well as to a potential regulatory effect, in agreement with literature evidence.

Availability and implementation: A prototype implementation is freely available at https://sourceforge.net/projects/dacoalgorithm/.

Contact: volkhard.helms@bioinformatik.uni-saarland.de

Supplementary information: Supplementary data are available at Bioinformatics online.
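The local maximization of cohesiveness mentioned in the Results can be illustrated with a minimal sketch. It is not the DACO implementation: it assumes the commonly used definition of cohesiveness (internal edge weight divided by internal plus boundary edge weight, without a penalty term), grows a cluster greedily from a single seed protein, and omits the domain-level connectivity constraints entirely. The names `cohesiveness`, `grow_cluster` and `TOY`, as well as the toy network itself, are hypothetical.

```python
from typing import Dict, Set

# Weighted, undirected protein interaction network as an adjacency map.
# Every edge must be listed in both directions.
Graph = Dict[str, Dict[str, float]]


def cohesiveness(graph: Graph, cluster: Set[str]) -> float:
    """Assumed definition: w_in / (w_in + w_bound)."""
    w_in = 0.0     # total weight of edges with both ends inside the cluster
    w_bound = 0.0  # total weight of edges leaving the cluster
    for u in cluster:
        for v, w in graph[u].items():
            if v in cluster:
                w_in += w / 2.0  # each internal edge is visited from both ends
            else:
                w_bound += w
    total = w_in + w_bound
    return w_in / total if total > 0 else 0.0


def grow_cluster(graph: Graph, seed: str) -> Set[str]:
    """Greedily add the neighbour that most increases cohesiveness.

    Stops at a local maximum, i.e. when no single addition improves the score.
    Domain-level constraints, as used by DACO, are NOT modelled here.
    """
    cluster = {seed}
    while True:
        best_score = cohesiveness(graph, cluster)
        best_node = None
        frontier = {v for u in cluster for v in graph[u]} - cluster
        for v in sorted(frontier):  # sorted for deterministic tie-breaking
            score = cohesiveness(graph, cluster | {v})
            if score > best_score:
                best_score, best_node = score, v
        if best_node is None:
            return cluster
        cluster.add(best_node)


# Hypothetical toy network: two triangles joined by the bridge C-D.
TOY: Graph = {
    "A": {"B": 1.0, "C": 1.0},
    "B": {"A": 1.0, "C": 1.0},
    "C": {"A": 1.0, "B": 1.0, "D": 1.0},
    "D": {"C": 1.0, "E": 1.0, "F": 1.0},
    "E": {"D": 1.0, "F": 1.0},
    "F": {"D": 1.0, "E": 1.0},
}

if __name__ == "__main__":
    print(sorted(grow_cluster(TOY, "A")))
```

Seeded at `A`, the growth stops at the first triangle, because adding `D` would bring in two new boundary edges and lower the score. A purely greedy scheme of this kind yields one self-contained cluster per seed, which is the limitation the domain-level constraints are meant to address.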
