Identification of a new family of putative PD-(D/E)XK nucleases with unusual phylogenomic distribution and a new type of the active site

Background: Prediction of structure and function for uncharacterized protein families by identification of evolutionary links to characterized families and known structures is one of the cornerstones of genomics. Theoretical assignment of three-dimensional folds and prediction of protein function, even at a very general level, can facilitate the experimental determination of the molecular mechanism of action and the role that members of a given protein family fulfill in the cell. Here, we predict the three-dimensional fold and study the phylogenomic distribution of members of a large family of uncharacterized proteins classified in the Clusters of Orthologous Groups database as COG4636.

Results: Using protein fold recognition, we found that members of COG4636 are remotely related to Holliday junction resolvases and other nucleases from the PD-(D/E)XK superfamily. Structure modeling and sequence analyses suggest that most members of COG4636 exhibit a new, unusual variant of the putative active site, in which the catalytic Lys residue has migrated in the sequence but retains a similar spatial position with respect to other functionally important residues. Sequence analyses revealed that members of COG4636 and their homologs are found mainly in Cyanobacteria, but also in other bacterial phyla. They undergo horizontal transfer and extensive proliferation in the colonized genomes; for instance, in Gloeobacter violaceus PCC 7421 they comprise over 2% of all protein-encoding genes. Thus, members of COG4636 appear to be a new type of selfish genetic element, which may fulfill an important role in the genome dynamics of Cyanobacteria and other species they have invaded. Our analyses provide a platform for experimental determination of the molecular and cellular function of members of this large protein family.

Conclusion: After submission of this manuscript, a crystal structure of one of the COG4636 members was released in the Protein Data Bank (code 1wdj; Idaka, M., Wada, T., Murayama, K., Terada, T., Kuramitsu, S., Shirouzu, M., Yokoyama, S.: Crystal structure of Tt1808 from Thermus thermophilus Hb8, to be published). Our analysis of the Tt1808 structure reveals that we correctly predicted all functionally important features of the COG4636 family, including the membership in the PD-(D/E)XK superfamily of nucleases, the three-dimensional fold, the putative catalytic residues, and the unusual configuration of the active site.
