Modular organization of inteins and C‐terminal autocatalytic domains

Analysis of the conserved sequence features of inteins (protein “introns”) reveals that they are composed of three distinct modular domains. The N‐terminal (N) and C‐terminal (C) domains are predicted to perform different parts of the autocatalytic protein splicing reaction. An optional endonuclease domain (EN) is shown to correspond to different types of homing endonucleases in different inteins. The N domain contains motifs predicted to catalyze the first steps of protein splicing, leading to the cleavage of the intein N terminus from its protein host. Intein N domain motifs are also found in C‐terminal autocatalytic domains (CADs) present in hedgehog and other protein families. Specific residues in the N domains of inteins and CADs are proposed to form a charge relay system involved in cleaving their N termini. The intein C domain is apparently unique to inteins and contains motifs that catalyze the final protein splicing steps: ligation of the intein flanks and cleavage of its C terminus to release the free intein and spliced host protein. All previously known intein EN domains have dodecapeptide (DOD, LAGLI‐DADG) type homing endonuclease motifs. This work identifies an EN domain with an HNH homing‐endonuclease motif and two new small inteins with no EN domains. One of these small inteins might be inactive, that is, a “pseudo intein.” The results suggest a modular architecture for inteins, clarify their origin and relationship to other protein families, and extend recent experimental findings on the functional roles of intein N, C, and EN motifs.
