Toward more accurate prediction of caspase cleavage sites: a comprehensive review of current methods, tools and features

Abstract As one of the few irreversible protein posttranslational modifications, proteolytic cleavage is involved in nearly all aspects of cellular activity, ranging from gene regulation to cell life-cycle regulation. Among the various protease-specific types of proteolytic cleavage, cleavages by caspases and granzyme B are considered essential in the initiation and execution of programmed cell death and inflammation. Although a number of substrates for both types of proteolytic cleavage have been experimentally identified, the complete repertoire of caspase and granzyme B substrates remains to be fully characterized. To tackle this issue and complement experimental efforts for substrate identification, systematic bioinformatics studies of known cleavage sites provide important insights into caspase/granzyme B substrate specificity and facilitate the discovery of novel substrates. In this article, we review and benchmark 12 state-of-the-art sequence-based bioinformatics approaches and tools for caspase/granzyme B cleavage prediction. We evaluate and compare these methods in terms of their input/output, algorithms used, prediction performance, validation methods, and software availability and utility. In addition, we construct independent data sets consisting of caspase/granzyme B substrates from different species and use them to assess the predictive power of these predictors for the identification of cleavage sites. We find that the prediction results are highly variable among different predictors. Furthermore, we experimentally validate the predictions of a case study by performing a caspase cleavage assay. We anticipate that this comprehensive review and survey analysis will provide an insightful resource for biologists and bioinformaticians who are interested in using and/or developing tools for caspase/granzyme B cleavage prediction.
