Correlated evolution of transcription factors and their binding sites

MOTIVATION The interaction between a transcription factor (TF) and its transcription factor binding site (TFBS) is essential for gene regulation. A mutation in either the TF or the TFBS may weaken their interaction and thus result in abnormalities. To maintain this vital interaction, a mutation in one partner might be compensated by a corresponding mutation in the other during the course of evolution. Confirming this co-evolutionary relationship would guide the design of protein sequences that target a specific DNA sequence, the prediction of TFBSs for poorly studied proteins, and even the correction and rescue of disease mutations in clinical applications.

RESULTS Based on six publicly available, experimentally validated TF-TFBS binding datasets for the basic Helix-Loop-Helix (bHLH) family, Homeo family, High-Mobility Group (HMG) family and Transient Receptor Potential channels (TRP) family, we showed that the evolution of TFs and that of their TFBSs are significantly correlated across eukaryotes. We further developed a mutual information-based method to identify co-evolved protein residues and DNA bases. This research sheds light on the dynamic relationship between TFs and TFBSs during their evolution. The same principle and strategy can be applied to co-evolutionary studies of protein-DNA interactions in other protein families.

AVAILABILITY All the datasets, scripts and other related files have been made freely available at: http://jjwanglab.org/co-evo.

CONTACT junwen@uw.edu.

SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
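The mutual information idea mentioned in the results can be illustrated with a minimal Python sketch. It is not the authors' method: it computes plain empirical mutual information between one column of a TF protein alignment and one column of the matched binding-site alignment, with no small-sample correction or significance test. The function names `mutual_information` and `rank_pairs` and the toy sequences are hypothetical, chosen only for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(col_x, col_y):
    """Empirical mutual information (in bits) between two aligned columns.

    col_x holds one residue per TF, col_y one base per matched binding site.
    Assumption: row k of both columns comes from the same TF-TFBS pair.
    """
    if len(col_x) != len(col_y):
        raise ValueError("columns must have the same number of rows")
    n = len(col_x)
    count_x = Counter(col_x)
    count_y = Counter(col_y)
    count_xy = Counter(zip(col_x, col_y))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (count_x[x] * count_y[y]))
    return mi


def rank_pairs(protein_seqs, dna_sites):
    """Score every (protein column, DNA column) pair, highest MI first."""
    protein_cols = list(zip(*protein_seqs))
    dna_cols = list(zip(*dna_sites))
    scores = [
        (mutual_information(p, d), i, j)
        for i, p in enumerate(protein_cols)
        for j, d in enumerate(dna_cols)
    ]
    return sorted(scores, reverse=True)


# Toy, made-up alignments: four TFs and their four matched sites.
toy_proteins = ["RKE", "RKE", "HKE", "HKD"]
toy_sites = ["CAG", "CAG", "CAT", "CAT"]
best_score, best_residue, best_base = rank_pairs(toy_proteins, toy_sites)[0]
print(best_score, best_residue, best_base)
```

In the toy data, protein column 0 (R/H) switches exactly when DNA column 2 (G/T) does, so that pair ranks first. On real alignments, high scores can also arise from shared phylogeny or small sample size, so a raw ranking like this would need a background model before any pair is called co-evolved.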
