Statistical models for discerning protein structures containing the DNA-binding helix-turn-helix motif.

A method has been developed for discerning protein structures that contain the DNA-binding helix-turn-helix (HTH) motif. The method uses statistical models based on geometrical measurements of the motif. A decision tree model identified key structural features required for DNA binding: a high average solvent accessibility of the residues within the recognition helix, and a conserved hydrophobic interaction between the recognition helix and the second alpha helix preceding it. A more accurate model of the motif, built with the AdaBoost algorithm, was then used to search the Protein Data Bank for structures with a high probability of containing the motif, including some that had not been reported previously.
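To make the modelling step concrete, the following is a minimal sketch of AdaBoost over one-feature decision stumps, written in plain Python. It is not the authors' implementation. The two features (mean solvent accessibility of the recognition helix, and a distance standing in for the hydrophobic contact), the toy values, and all function names are assumptions chosen for illustration only; the abstract does not give the actual feature set, data, or software used.

```python
import math

def stump_predict(row, feature, threshold, polarity):
    """One-feature rule: return polarity if row[feature] >= threshold, else -polarity."""
    return polarity if row[feature] >= threshold else -polarity

def best_stump(X, y, w):
    """Exhaustively pick the stump with the lowest weighted training error."""
    best = None
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for polarity in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_predict(row, feature, threshold, polarity) != yi)
                if best is None or err < best[0]:
                    best = (err, feature, threshold, polarity)
    return best

def adaboost(X, y, rounds=5):
    """Train a weighted vote of stumps; labels must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n                      # start with uniform example weights
    model = []
    for _ in range(rounds):
        err, feature, threshold, polarity = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)   # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)  # vote strength of this stump
        model.append((alpha, feature, threshold, polarity))
        # Raise the weight of misclassified examples, lower the rest, renormalise.
        w = [wi * math.exp(-alpha * yi * stump_predict(row, feature, threshold, polarity))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def score(model, row):
    """Signed confidence: sum of stump votes weighted by alpha."""
    return sum(alpha * stump_predict(row, f, t, p) for alpha, f, t, p in model)

def predict(model, row):
    return 1 if score(model, row) >= 0 else -1

# Invented toy data (NOT from the paper):
# (mean accessibility fraction, contact distance in angstroms); +1 = HTH-like.
X = [(0.62, 4.1), (0.58, 3.9), (0.66, 4.4), (0.55, 4.0),
     (0.31, 7.2), (0.28, 6.5), (0.35, 8.0), (0.40, 6.9)]
y = [1, 1, 1, 1, -1, -1, -1, -1]

model = adaboost(X, y, rounds=5)
```

A candidate structure would be reduced to the same feature tuple and passed to `predict(model, row)`, while `score` gives a signed confidence that could be used to rank database hits. A single stump of this kind is also the simplest possible decision tree, which is one way to read how a tree model can expose individual structural features such as solvent accessibility.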


[37]  Ian H. Witten,et al.  Data mining: practical machine learning tools and techniques with Java implementations , 2002, SGMD.

[38]  A. Gronenborn,et al.  A novel class of winged helix-turn-helix protein: the DNA-binding domain of Mu transposase. , 1994, Structure.

[39]  T. N. Bhat,et al.  The Protein Data Bank , 2000, Nucleic Acids Res..

[40]  Ian H. Witten,et al.  Data mining: practical machine learning tools and techniques, 3rd Edition , 1999 .

[41]  Frances M. G. Pearl,et al.  The CATH Dictionary of Homologous Superfamilies (DHS): a consensus approach for identifying distant structural homologues. , 2000, Protein engineering.

[42]  A. Gronenborn,et al.  Solution structure of the cellular factor BAF responsible for protecting retroviral DNA from autointegration , 1998, Nature Structural Biology.

[43]  R. Ghirlando,et al.  Crystal structure of the Xrcc4 DNA repair protein and implications for end joining , 2000, The EMBO journal.

[44]  R A Sayle,et al.  RASMOL: biomolecular graphics for all. , 1995, Trends in biochemical sciences.

[45]  S K Burley,et al.  Winged helix proteins. , 2000, Current opinion in structural biology.

[46]  R. Brennan DNA recognition by the helix-turn-helix motif , 1992 .

[47]  R. Stevens,et al.  Global Efforts in Structural Genomics , 2001, Science.

[48]  Adam Godzik,et al.  Clustering of highly homologous sequences to reduce the size of large protein databases , 2001, Bioinform..

[49]  Jaime Prilusky,et al.  Automated analysis of interatomic contacts in proteins , 1999, Bioinform..

[50]  E. Koonin,et al.  DNA-binding proteins and evolution of transcription regulation in the archaea. , 1999, Nucleic acids research.

[51]  W R Taylor,et al.  SSAP: sequential structure alignment program for protein structure comparison. , 1996, Methods in enzymology.

[52]  Tim J. P. Hubbard,et al.  SCOP: a structural classification of proteins database , 1998, Nucleic Acids Res..

[53]  R. Brennan,et al.  Crystal Structure of MtaN, a Global Multidrug Transporter Gene Activator* , 2001, The Journal of Biological Chemistry.

[54]  P E Bourne,et al.  The Protein Data Bank. , 2002, Nucleic acids research.

[55]  K Nasmyth,et al.  Crystal structure of the DNA-binding domain of Mbp1, a transcription factor important in cell-cycle control of DNA synthesis. , 1997, Structure.

[56]  D. Hinton,et al.  Binding of the bacteriophage T4 transcriptional activator, MotA, to T4 middle promoter DNA: evidence for both major and minor groove contacts. , 1999, Journal of molecular biology.

[57]  R. Stroud,et al.  Crystal Structure of the Signal Sequence Binding Subunit of the Signal Recognition Particle , 1998, Cell.

[58]  J. Coleman,et al.  Zinc proteins: enzymes, storage proteins, transcription factors, and replication proteins. , 1992, Annual review of biochemistry.

[59]  Thomas L. Madden,et al.  Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. , 1997, Nucleic acids research.

[60]  H. Nelson,et al.  Structure and function of DNA-binding proteins. , 1995, Current opinion in genetics & development.

[61]  William R. Atchley,et al.  Molecular Evolution of Helix–Turn–Helix Proteins , 1999, Journal of Molecular Evolution.

[62]  R. Dickerson,et al.  Hin recombinase bound to DNA: the origin of specificity in major and minor groove interactions. , 1994, Science.

[63]  M. Gerstein,et al.  DNA recognition and superstructure formation by helix-turn-helix proteins. , 1995, Protein engineering.

[64]  Yoav Freund,et al.  Boosting the margin: A new explanation for the effectiveness of voting methods , 1997, ICML.

[65]  M. Belfort,et al.  Intertwined structure of the DNA‐binding domain of intron endonuclease I‐TevI with its substrate , 2001, The EMBO journal.

[66]  R. Sauer,et al.  Transcription factors: structural families and principles of DNA recognition. , 1992, Annual review of biochemistry.

[67]  K. Taylor,et al.  Crystal structure of the cyanobacterial metallothionein repressor SmtB: a model for metalloregulatory proteins. , 1998, Journal of molecular biology.

[68]  R. Dixon,et al.  Affinity chromatography, substrate/product specificity, and amino acid sequence analysis of an isoflavone O-methyltransferase from alfalfa (Medicago sativa L.). , 1996, Archives of biochemistry and biophysics.

[69]  R. Blumenthal,et al.  Structure of pvu II DNA-(cytosine N4) methyltransferase, an example of domain permutation and protein fold assignment. , 1997, Nucleic acids research.

[70]  F. Guo,et al.  Structure of the Holliday junction intermediate in Cre–loxP site‐specific recombination , 1998, The EMBO journal.

[71]  I. Tanaka,et al.  Ribosomal protein S7: a new RNA-binding motif with structural similarities to a DNA architectural factor. , 1997, Structure.

[72]  D. Hall,et al.  The high‐resolution crystal structure of the molybdate‐dependent transcriptional regulator (ModE) from Escherichia coli: a novel combination of domain folds , 1999, The EMBO journal.

[73]  A. Aggarwal,et al.  Structure of the multimodular endonuclease FokI bound to DNA , 1997, Nature.

[74]  V. Ramakrishnan,et al.  Recognition of Cognate Transfer RNA by the 30S Ribosomal Subunit , 2001, Science.

[75]  J. Ross Quinlan,et al.  Induction of Decision Trees , 1986, Machine Learning.