Identification and analysis of a new family of bacterial serine proteinases

A family of hypothetical proteins, identified predominantly from bacterial genomes, has been analyzed in order to understand its functional characteristics. Extensive sequence similarity searches indicate that this family is remotely related (best sequence identity is 19%) to the ClpP proteinases, which belong to the serine proteinase class. We refer to this family of hypothetical proteins as the SDH proteinase family, based on the conserved sequential order of Ser, Asp and His residues and the predicted serine proteinase activity. Fold recognition on SDH family sequences confirmed the remote relationship between SDH proteinases and Clp proteinases and revealed a similar tertiary location of the putative catalytic triad residues critical for serine proteinase function. However, the best sequence alignment we could obtain suggests that, while the catalytic Ser is conserved across Clp and SDH proteinases, the other two catalytic triad residues, namely His and Asp, are swapped in their alignment positions and hence in the 3-D structure. The evidence of a conserved catalytic triad suggests that SDH could be a new family of serine proteinases with the fold of Clp proteinase but sharing the catalytic triad order of the carboxypeptidase clan. A signal peptide sequence identified at the N-terminus of some of the homologues suggests that these might be secretory serine proteinases involved in the cleavage of extracellular proteins, whereas the remote homologues, the ClpP proteinases, are known to work in the intracellular environment.
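The notion of a "catalytic triad order" used above can be illustrated with a minimal Python sketch. It is not the method used in this work; the function name `triad_order` and all residue positions are hypothetical placeholders chosen only to show how a Ser-His-Asp order differs from a Ser-Asp-His order when the residues are sorted by sequence position.

```python
def triad_order(positions):
    """Return the one-letter residue codes sorted by sequence position.

    `positions` maps a one-letter residue code to its position in the
    sequence, e.g. {"S": 100, "H": 125, "D": 175}.
    """
    ordered = sorted(positions.items(), key=lambda item: item[1])
    return "".join(residue for residue, _ in ordered)


# Hypothetical positions, for illustration only (not taken from the paper).
clp_like = {"S": 100, "H": 125, "D": 175}  # Ser precedes His, His precedes Asp
sdh_like = {"S": 100, "D": 125, "H": 175}  # His and Asp swapped relative to above

print(triad_order(clp_like))
print(triad_order(sdh_like))
```

Under these assumed positions the two sequences share the Ser position, while the His and Asp positions are exchanged, which is the kind of swap the alignment described above suggests.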