Sequence‐derived structural features driving proteolytic processing

Proteolytic signaling, or regulated proteolysis, is an essential part of many important pathways, such as Notch, Wnt, and Hedgehog. How the structure of the cleaved substrate regions influences the efficacy of proteolytic processing remains underexplored. Here, we analyzed the relative importance for proteolysis of various structural features derived from substrate sequences, using a dataset of more than 5000 experimentally verified proteolytic events captured in CutDB. Solvent accessibility emerged as an essential property of a proteolytically processed polypeptide chain. Proteolytic events were distributed nearly uniformly among the three types of secondary structure, although with some enrichment in loops. Cleavages in α‐helices were relatively abundant in regions apparently prone to unfolding, while cleavages in β‐structures tended to be located at the periphery of β‐sheets. Applying the same statistical procedures to proteolytic events divided into separate sets according to the catalytic classes of proteases gave consistent results and confirmed that the structural mechanisms of proteolysis are universal. The estimated predictive power of sequence‐derived structural features was sufficiently high to provide a rationale for their use in bioinformatic prediction of proteolytic events.
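
The enrichment of cleavage sites in a given secondary-structure class can be illustrated with a simple observed-to-expected ratio. The sketch below is not the statistical procedure used in the study, which the abstract does not specify. It assumes a three-letter secondary-structure alphabet (H for helix, E for strand, C for loop), the function name `enrichment_by_class` is hypothetical, and the toy sequence and cleavage positions are invented for illustration rather than taken from CutDB.

```python
from collections import Counter


def enrichment_by_class(ss_string, cleavage_positions):
    """Observed/expected ratio of cleavage sites per secondary-structure class.

    ss_string: per-residue secondary structure, e.g. "CCHHHEEC" (assumed alphabet).
    cleavage_positions: 0-based residue indices of cleavage sites.
    A ratio above 1 means the class holds more cleavage sites than its share
    of residues would suggest; below 1 means fewer.
    """
    if not ss_string or not cleavage_positions:
        raise ValueError("need a non-empty structure string and at least one site")

    background = Counter(ss_string)  # residues per class
    observed = Counter(ss_string[i] for i in cleavage_positions)  # sites per class
    n_residues = len(ss_string)
    n_sites = len(cleavage_positions)

    return {
        ss_class: (observed[ss_class] / n_sites) / (background[ss_class] / n_residues)
        for ss_class in background
    }


# Toy input (hypothetical, not real data): 20 residues, 4 cleavage sites.
toy_ss = "CCCHHHHHHCCCEEEEECCC"
toy_cuts = [1, 10, 13, 18]
ratios = enrichment_by_class(toy_ss, toy_cuts)

for ss_class in sorted(ratios):
    print(ss_class, round(ratios[ss_class], 2))
```

On real data the ratio would be computed over all substrates in the dataset and accompanied by a significance test, since a ratio from a handful of sites says little on its own.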
