Positional Dependence, Cliques, and Predictive Motifs in the bHLH Protein Domain

Abstract. Quantitative analyses were carried out on a large number of proteins that contain the highly conserved basic helix–loop–helix (bHLH) domain. Measures derived from information theory were used to examine both the extent of conservation at individual amino acid sites within the bHLH domain and the extent of mutual information among sites. Using the Boltzmann entropy measure, we described the extent of amino acid conservation throughout the bHLH domain. We used position association (pa) statistics, which reflect the joint probability of occurrence of events, to estimate the "mutual information content" among distinct amino acid sites. Further, we used pa statistics to estimate the extent of association in amino acid composition among sites in the domain, and between amino acid composition and variables reflecting clade and group membership, loop length, and the presence of a leucine zipper. The pa values were also used to describe "cliques", groups of amino acid sites that are highly associated with one another. Finally, a predictive motif was constructed that accurately identifies bHLH domain-containing proteins belonging to Groups A and B.
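As a rough illustration of the two kinds of quantities described above, the sketch below computes a per-site entropy and a pairwise mutual information for the columns of a small aligned set of sequences. It is a minimal sketch under stated assumptions: the abstract does not define the pa statistic, so standard Shannon entropy (base 2) and standard mutual information are used here as stand-ins, and the four-residue `toy_alignment` and all function names are hypothetical, not taken from the study.

```python
from collections import Counter
from math import log2


def column(alignment, i):
    """Return the residues found at site i across all aligned sequences."""
    return [seq[i] for seq in alignment]


def entropy(symbols):
    """Shannon entropy (bits) of the empirical distribution of symbols."""
    n = len(symbols)
    return -sum((c / n) * log2(c / n) for c in Counter(symbols).values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), estimated from paired observations."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


# Hypothetical toy alignment (NOT real bHLH data): four sequences, four sites.
toy_alignment = ["RKLE", "RKLE", "KRLD", "KRLD"]
n_sites = len(toy_alignment[0])

# Low entropy marks a conserved site; site 2 is invariant here.
site_entropy = [entropy(column(toy_alignment, i)) for i in range(n_sites)]

# Sites 0 and 1 covary perfectly; site 2 carries no information about site 0.
mi_01 = mutual_information(column(toy_alignment, 0), column(toy_alignment, 1))
mi_02 = mutual_information(column(toy_alignment, 0), column(toy_alignment, 2))
```

In this toy case the invariant site has zero entropy, and the two perfectly covarying sites share one full bit of mutual information. Sites whose pairwise association all exceeds some chosen threshold could then be grouped in the spirit of the "cliques" mentioned above, although the thresholds and grouping procedure used in the study are not given here.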
