Identification of DNA-binding proteins using structural, electrostatic and evolutionary features.

[1]  B. Matthews Comparison of the predicted and observed secondary structure of T4 phage lysozyme. , 1975, Biochimica et biophysica acta.

[2]  W. Kabsch,et al.  Dictionary of protein secondary structure: Pattern recognition of hydrogen‐bonded and geometrical features , 1983, Biopolymers.

[3]  R. Sauer,et al.  Protein-DNA recognition. , 1984, Annual review of biochemistry.

[4]  A. D. McLachlan,et al.  Profile analysis: detection of distantly related proteins. , 1987, Proceedings of the National Academy of Sciences of the United States of America.

[5]  P. V. von Hippel,et al.  Facilitated Target Location in Biological Systems , 1989, The Journal of biological chemistry.

[6]  P. Ross,et al.  Thermodynamics of Cro protein-DNA interactions. , 1992, Proceedings of the National Academy of Sciences of the United States of America.

[7]  R. Roberts,et al.  Crystal structure of the HhaI DNA methyltransferase complexed with S-adenosyl-L-methionine. , 1993, Cell.

[8]  B. Rost,et al.  Improved prediction of protein secondary structure by use of sequence profiles and neural networks. , 1993, Proceedings of the National Academy of Sciences of the United States of America.

[10]  Frederick P. Brooks,et al.  Fast analytical computation of Richards's smooth molecular surface , 1993, Proceedings Visualization '93.

[11]  R. Roberts,et al.  HhaI methyltransferase flips its target base out of the DNA helix , 1994, Cell.

[12]  J. Thompson,et al.  CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. , 1994, Nucleic acids research.

[13]  R. Roberts An amazing distortion in DNA induced by a methyltransferase. , 1994, Bioscience reports.

[14]  J. Thornton,et al.  Satisfying hydrogen bonding potential in proteins. , 1994, Journal of molecular biology.

[15]  B. Honig,et al.  Classical electrostatics in biology and chemistry. , 1995, Science.

[16]  H. Margalit,et al.  Comprehensive analysis of hydrogen bonds in regulatory protein DNA-complexes: in search of common principles. , 1995, Journal of molecular biology.

[17]  A G Murzin,et al.  SCOP: a structural classification of proteins database for the investigation of sequences and structures. , 1995, Journal of molecular biology.

[18]  R. Roberts,et al.  Enzymatic C5-cytosine methylation of DNA: mechanistic implications of new crystal structures for HhaI methyltransferase-DNA-AdoHcy complexes. , 1996, Journal of molecular biology.

[19]  Chris Sander,et al.  The HSSP database of protein structure-sequence alignments , 1993, Nucleic Acids Res..

[20]  Gapped BLAST and PSI-BLAST: A new , 1997 .

[21]  J. Thornton,et al.  NUCPLOT: a program to generate schematic diagrams of protein-nucleic acid interactions. , 1997, Nucleic acids research.

[22]  Barry L. Stoddard,et al.  DNA binding and cleavage by the nuclear intron-encoded homing endonuclease I-PpoI , 1998, Nature.

[23]  J. Thornton,et al.  PQS: a protein quaternary structure file server. , 1998, Trends in biochemical sciences.

[24]  Alexander D. MacKerell,et al.  All-atom empirical potential for molecular modeling and dynamics studies of proteins. , 1998, The journal of physical chemistry. B.

[25]  H M Berman,et al.  Protein-DNA interactions: A structural analysis. , 1999, Journal of molecular biology.

[26]  M F Sanner,et al.  Python: a programming language for software integration and development. , 1999, Journal of molecular graphics & modelling.

[27]  Ian H. Witten,et al.  Data mining: practical machine learning tools and techniques, 3rd Edition , 1999 .

[28]  K Nadassy,et al.  Structural features of protein-nucleic acid recognition sites. , 1999, Biochemistry.

[29]  Liisa Holm,et al.  DaliLite workbench for protein structure comparison , 2000, Bioinform..

[30]  T. N. Bhat,et al.  The Protein Data Bank , 2000, Nucleic Acids Res..

[31]  J M Berger,et al.  Structure and function of Cdc6/Cdc18: implications for origin recognition and checkpoint control. , 2000, Molecular cell.

[32]  C. Lukacs,et al.  Understanding the immutability of restriction enzymes: crystal structure of BglII and its DNA substrate at 1.5 Å resolution , 2000, Nature Structural Biology.

[33]  C. Pabo,et al.  Geometric analysis and comparison of protein-DNA interfaces: why is there no simple code for recognition? , 2000, Journal of molecular biology.

[34]  M. Ashburner,et al.  Gene Ontology: tool for the unification of biology , 2000, Nature Genetics.

[35]  B Honig,et al.  An integrated approach to the analysis and modeling of protein sequences and structures. I. Protein structural alignment and a quantitative measure for protein structural distance. , 2000, Journal of molecular biology.

[36]  J. Thornton,et al.  An overview of the structures of protein-DNA complexes , 2000, Genome Biology.

[37]  Nathan A. Baker,et al.  Electrostatics of nanosystems: Application to microtubules and the ribosome , 2001, Proceedings of the National Academy of Sciences of the United States of America.

[38]  H. Dyson,et al.  Coupling of folding and binding for unstructured proteins. , 2002, Current opinion in structural biology.

[39]  O. Lichtarge,et al.  Structural clusters of evolutionary trace residues are statistically significant and common in proteins. , 2002, Journal of molecular biology.

[40]  Janet M Thornton,et al.  Protein-DNA interactions: amino acid conservation and the effects of mutations on binding specificity. , 2002, Journal of molecular biology.

[41]  Janet M Thornton,et al.  Using electrostatic potentials to predict DNA-binding sites on DNA-binding proteins. , 2003, Nucleic acids research.

[42]  Yael Mandel-Gutfreund,et al.  Annotating nucleic acid-binding function based on protein structure. , 2003, Journal of molecular biology.

[43]  Nathan A. Baker,et al.  PDB2PQR: an automated pipeline for the setup of Poisson-Boltzmann electrostatics calculations , 2004, Nucleic Acids Res..

[44]  Leo Breiman,et al.  Random Forests , 2001, Machine Learning.

[45]  Shandar Ahmad,et al.  Analysis and prediction of DNA-binding proteins and their binding residues based on composition, sequence and structural information , 2004, Bioinform..

[46]  Kengo Kinoshita,et al.  Structure‐based prediction of DNA‐binding sites on proteins Using the empirical preference of electrostatic potential and the shape of molecular surfaces , 2004, Proteins.

[47]  Satya Prakash,et al.  Replication by human DNA polymerase-ι occurs by Hoogsteen base-pairing , 2004, Nature.

[48]  Akinori Sarai,et al.  Moment-based prediction of DNA-binding proteins. , 2004, Journal of molecular biology.

[49]  A. Pingoud,et al.  Type II restriction endonucleases: structure and mechanism , 2005, Cellular and Molecular Life Sciences.

[51]  K Henrick,et al.  Secondary-structure matching (SSM), a new tool for fast protein structure alignment in three dimensions. , 2004, Acta crystallographica. Section D, Biological crystallography.

[52]  Janet M Thornton,et al.  Identifying DNA-binding proteins using structural motifs and the electrostatic potential. , 2004, Nucleic acids research.

[53]  N. Ben-Tal,et al.  Comparison of site-specific rate-inference methods for protein sequences: empirical Bayesian methods are superior. , 2004, Molecular biology and evolution.

[54]  L. Mirny,et al.  Kinetics of protein-DNA interaction: facilitated target location in sequence-dependent potential. , 2004, Biophysical journal.

[55]  Gert Lubec,et al.  Searching for hypothetical proteins: Theory and practice based upon original data and literature , 2005, Progress in Neurobiology.

[56]  M. Moorhouse,et al.  The Protein Databank , 2005 .

[57]  Tal Pupko,et al.  In silico identification of functional regions in proteins , 2005, ISMB.

[58]  Cathy H. Wu,et al.  The Universal Protein Resource (UniProt) , 2005, Nucleic Acids Res..

[59]  Guoli Wang,et al.  PISCES: recent improvements to a PDB sequence culling server , 2005, Nucleic Acids Res..

[60]  J. Reeve,et al.  Archaeal chromatin proteins: different structures but common function? , 2005, Current opinion in microbiology.

[61]  D. Lejeune,et al.  Protein–nucleic acid recognition: Statistical analysis of atomic interactions and influence of DNA structure , 2005, Proteins.

[62]  Janet M. Thornton,et al.  ProFunc: a server for predicting protein function from 3D structure , 2005, Nucleic Acids Res..

[63]  N. Bhardwaj,et al.  Kernel-based machine learning protocol for predicting DNA-binding proteins , 2005, Nucleic acids research.

[64]  Itay Mayrose,et al.  ConSurf 2005: the projection of evolutionary conservation scores of residues on protein structures , 2005, Nucleic Acids Res..

[65]  Janet M. Thornton,et al.  HTHquery: a method for detecting DNA-binding proteins with a helix-turn-helix structural motif , 2005, Bioinform..

[66]  Janet M Thornton,et al.  Protein function prediction using local 3D templates. , 2005, Journal of molecular biology.

[67]  Adam Godzik,et al.  Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences , 2006, Bioinform..

[68]  Mark Goadrich,et al.  The relationship between Precision-Recall and ROC curves , 2006, ICML.

[69]  Jeffrey Skolnick,et al.  Efficient prediction of nucleic acid binding function from low-resolution protein structures. , 2006, Journal of molecular biology.

[70]  Conrad C. Huang,et al.  Tools for integrated sequence-structure analysis with UCSF Chimera , 2006, BMC Bioinformatics.

[71]  Iddo Friedberg,et al.  Automated protein function prediction – the genomic challenge , 2006 .

[72]  Robert D. Finn,et al.  Pfam: clans, web tools and services , 2005, Nucleic Acids Res..

[73]  A. Grosberg,et al.  How proteins search for their specific sites on DNA: the role of DNA conformation. , 2006, Biophysical journal.

[74]  A. Kolstø,et al.  A new protein superfamily includes two novel 3-methyladenine DNA glycosylases from Bacillus cereus, AlkC and AlkD , 2006, Molecular microbiology.

[75]  J. Berger,et al.  Replication Origin Recognition and Deformation by a Heterodimeric Archaeal Orc1 Complex , 2007, Science.

[76]  Burkhard Rost,et al.  Prediction of DNA-binding residues from sequence , 2007, ISMB/ECCB.

[77]  T. Rognes,et al.  Structural insight into repair of alkylated DNA by a new superfamily of DNA glycosylases comprising HEAT-like repeats , 2007, Nucleic acids research.

[78]  Robert D. Finn,et al.  New developments in the InterPro database , 2007, Nucleic Acids Res..

[79]  A. Isaksson,et al.  Cross-validation and bootstrapping are unreliable in small sample classification , 2008, Pattern Recognit. Lett..

[80]  Yael Mandel-Gutfreund,et al.  Classifying RNA-Binding Proteins Based on Electrostatic Properties , 2008, PLoS Comput. Biol..

[81]  Nir Ben-Tal,et al.  Detection of functionally important regions in "hypothetical proteins" of known structure. , 2008, Structure.