Efficient and Accurate Algorithm for Cleaved Fragments Prediction (CFPA) in Protein Sequences Dataset Based on Consensus and Its Variants: A Novel Degradomics Prediction Application.

Degradomics is a novel discipline that involves determination of the protease/substrate fragmentation profile, called the substrate degradome, and has recently been applied in several fields. A major application of degradomics is in the field of biomarkers, where the breakdown products (BDPs) of different proteases have been investigated. Among the major proteases assessed, calpain and caspase proteases have been associated with the execution phases of pro-apoptotic and pro-necrotic cell death, generating caspase/calpain-specific cleaved fragments. The distinction between calpain and caspase protein fragments has been applied to distinguish injury mechanisms. Advanced proteomics technology has been used to identify these BDPs experimentally. However, it has been a challenge to identify these BDPs with high precision and efficiency, especially when a number of proteins are targeted at one time. In this chapter, we present a novel bioinformatic detection method that identifies BDPs accurately and efficiently, with validation against experimental data. The method predicts the occurrences of a consensus sequence and its variants in a large set of experimentally detected protein sequences, based on state-of-the-art sequence matching and alignment algorithms. After detection, the method generates all the potential fragments cleaved by a specific protease. This space- and time-efficient algorithm is flexible enough to handle the different orientations that the consensus sequence and the protein sequence can take before cleaving. It is O(mn) in space complexity and O(Nmn) in time complexity, where N is the number of protein sequences, m the length of the consensus sequence, and n the length of each protein sequence. Ultimately, this knowledge will feed into the development of a novel tool for researchers to detect diverse types of selected BDPs as putative disease markers, contributing to the diagnosis and treatment of related disorders.
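
To make the stated complexity bounds concrete, the following is a minimal sketch of one way such a pipeline could look; it is not the CFPA implementation itself, which the abstract does not detail. It assumes an approximate-matching dynamic program (in the style of semi-global edit-distance alignment) to locate the consensus sequence and its variants, and it assumes that cleavage occurs immediately after the matched motif. The function names (`motif_end_positions`, `cleave`, `predict_fragments`), the `max_edits` parameter, and the example sequences are all hypothetical.

```python
from typing import Dict, List


def motif_end_positions(protein: str, consensus: str, max_edits: int = 0) -> List[int]:
    """Return end positions (exclusive, 0-based) of substrings of `protein`
    that match `consensus` with at most `max_edits` edits.

    Uses an (m+1) x (n+1) table, i.e. O(mn) space and O(mn) time per protein.
    """
    m, n = len(consensus), len(protein)
    # dist[i][j]: fewest edits needed to turn consensus[:i] into some
    # substring of protein that ends at position j. Row 0 stays at zero
    # so that a match may start anywhere in the protein.
    dist = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dist[i][0] = i
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if consensus[i - 1] == protein[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j - 1] + cost,  # match or substitution
                dist[i - 1][j] + 1,         # consensus residue missing in protein
                dist[i][j - 1] + 1,         # extra residue in protein
            )
    return [j for j in range(1, n + 1) if dist[m][j] <= max_edits]


def cleave(protein: str, cut_sites: List[int]) -> List[str]:
    """Split `protein` at each cut site (assumption: cut right after the motif)."""
    inner = sorted({c for c in cut_sites if 0 < c < len(protein)})
    bounds = [0] + inner + [len(protein)]
    return [protein[a:b] for a, b in zip(bounds, bounds[1:])]


def predict_fragments(
    proteins: Dict[str, str], consensus: str, max_edits: int = 0
) -> Dict[str, List[str]]:
    """Apply detection and cleavage to N proteins: O(Nmn) time overall."""
    return {
        name: cleave(seq, motif_end_positions(seq, consensus, max_edits))
        for name, seq in proteins.items()
    }


if __name__ == "__main__":
    # Toy sequence (not real data) containing two exact DEVD motifs.
    toy = {"toy_protein": "MKADEVDGSTLDEVDAAK"}
    print(predict_fragments(toy, "DEVD"))
```

With `max_edits` greater than zero, the table also reports end positions adjacent to each true occurrence, so a practical tool would need to filter or rank overlapping hits before cleaving; that step is omitted here.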
