Pripper: prediction of caspase cleavage sites from whole proteomes

Background
Caspases are a family of proteases that have central functions in programmed cell death (apoptosis) and inflammation. Caspases mediate their effects through aspartate-specific cleavage of their target proteins, and at present almost 400 caspase substrates are known. Several methods have been developed to predict caspase cleavage sites in individual proteins, but none of them can currently be used to predict caspase cleavage sites in multiple proteins or entire proteomes, or to use several classifiers in combination. The ability to create a database of predicted caspase cleavage products for the whole genome could significantly aid in identifying novel caspase targets from tandem mass spectrometry based proteomic experiments.

Results
Three different pattern recognition classifiers were developed for predicting caspase cleavage sites from protein sequences. Evaluation of the classifiers with quality measures indicated that all three classifiers performed well in predicting caspase cleavage sites, and that combining different classifiers increased the accuracy further. A new tool, Pripper, was developed to utilize the classifiers and predict the caspase cut sites from an arbitrary number of input sequences. A database was constructed with the developed tool, and it was used to identify caspase target proteins from tandem mass spectrometry data from two different proteomic experiments. Both known and novel caspase cleavage products were identified using the database, demonstrating the usefulness of the tool. Pripper is not restricted to predicting caspase cut sites: it can also scan protein sequences for any given motif(s), and it can predict cut sites for any other protease once a suitable cut site prediction model has been developed. Pripper is freely available and can be downloaded from http://users.utu.fi/mijopi/Pripper.

Conclusions
We have developed Pripper, a tool for reading an arbitrary number of proteins in FASTA format, predicting their caspase cleavage sites and outputting the cleaved sequences to a new FASTA format sequence file. We show that Pripper is a valuable tool in identifying novel caspase target proteins from modern proteomics experiments.
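The read, predict and write workflow described above can be illustrated with a minimal Python sketch. It is not Pripper's implementation and does not use its trained classifiers. It stands in a plain motif scan for the prediction step, assuming the DEVD tetrapeptide (a commonly cited caspase-3 recognition motif) and a cut immediately after the final aspartate. The function names (read_fasta, find_cut_sites, cleave, to_fasta) and the fragment naming scheme are hypothetical and chosen only for this illustration.

```python
import re
from typing import Iterator


def read_fasta(text: str) -> Iterator[tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def find_cut_sites(sequence: str, motif: str = "DEVD") -> list[int]:
    """Return cut positions (index of the first residue after the motif).

    The motif is treated as a regular expression; a lookahead is used so
    that overlapping matches are also reported. This is a stand-in for a
    trained classifier, not the prediction model from the paper.
    """
    pattern = re.compile(f"(?=({motif}))")
    return [m.start() + len(m.group(1)) for m in pattern.finditer(sequence)]


def cleave(header: str, sequence: str, motif: str = "DEVD") -> list[tuple[str, str]]:
    """Return N- and C-terminal fragments for every predicted cut site."""
    name = header.split()[0]
    records = []
    for cut in find_cut_sites(sequence, motif):
        if cut >= len(sequence):
            continue  # motif at the very end: no C-terminal fragment
        records.append((f"{name}|cut{cut}|N", sequence[:cut]))
        records.append((f"{name}|cut{cut}|C", sequence[cut:]))
    return records


def to_fasta(records: list[tuple[str, str]], width: int = 60) -> str:
    """Format (header, sequence) pairs as FASTA text with wrapped lines."""
    lines = []
    for header, seq in records:
        lines.append(f">{header}")
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    demo = ">P1 demo protein\nMKDEVDGAL\n>P2 no site\nAAAA\n"
    fragments = []
    for hdr, seq in read_fasta(demo):
        fragments.extend(cleave(hdr, seq))
    print(to_fasta(fragments), end="")
```

Writing each cleavage product as its own FASTA record is what would let a mass spectrometry search engine match peptides that span a newly created terminus. In a more realistic setting, find_cut_sites would be replaced by a scoring model and a threshold.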
