Prediction of neuropeptide cleavage sites in insects

MOTIVATION: Neuropeptides are produced from their precursor proteins through a complex series of enzymatic processing steps. Annotation of new neuropeptide genes from sequence information often outpaces biochemical assays, so bioinformatics tools can rapidly indicate the peptides a gene is most likely to produce. Predicting the final bioactive neuropeptides from a precursor protein requires accurate algorithms to determine which locations in the protein are cleaved.

RESULTS: Predictive models were trained on Apis mellifera and Drosophila melanogaster precursors using binary logistic regression, multi-layer perceptron and k-nearest neighbor models. The final predictive models included specific amino acids at locations relative to the cleavage sites. Correct classification rates ranged from 78 to 100%, indicating that the models adequately predicted cleaved and non-cleaved positions across a wide range of neuropeptide families and insect species. The model trained on D. melanogaster data generalized better than the model trained on A. mellifera data for the data sets considered. The reliable and consistent performance of the models on the test data sets suggests that the bioinformatics strategies proposed here can accurately predict neuropeptides in insects for which only sequence information is available, using models based on neuropeptides with both biochemical and sequence information from well-studied species.
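As a rough illustration of what "specific amino acids at locations relative to the cleavage sites" can mean as model input, the following minimal Python sketch extracts a fixed window around each candidate site, one-hot encodes it, and computes a correct classification rate. It is not the authors' implementation. The window size (4 residues up to and including the candidate residue, 2 after it), the restriction of candidates to basic residues (K/R), the padding symbol, the example sequence, and all function names are assumptions made for illustration only.

```python
from typing import List, Tuple

# Assumption: candidate cleavage sites are taken to follow a basic residue (K or R).
BASIC = {"K", "R"}
# 20 standard amino acids plus "-" used as padding beyond the sequence ends.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"


def candidate_windows(seq: str, left: int = 4, right: int = 2,
                      pad: str = "-") -> List[Tuple[int, str]]:
    """Return (1-based position, window) for every K/R in `seq`.

    The window holds `left` residues ending at the candidate residue and
    `right` residues after it, padded where it runs off the sequence.
    """
    padded = pad * left + seq + pad * right
    windows = []
    for i, residue in enumerate(seq):
        if residue in BASIC:
            # seq[i] sits at padded[i + left]; slice so it is the last of `left` residues.
            windows.append((i + 1, padded[i + 1:i + left + 1 + right]))
    return windows


def one_hot(window: str) -> List[int]:
    """Encode each window position as a 0/1 indicator vector over ALPHABET."""
    features = []
    for residue in window:
        indicators = [0] * len(ALPHABET)
        index = ALPHABET.find(residue)
        if index >= 0:  # residues outside ALPHABET stay all-zero
            indicators[index] = 1
        features.extend(indicators)
    return features


def correct_classification_rate(observed: List[int], predicted: List[int]) -> float:
    """Fraction of positions whose predicted label (1 = cleaved) matches the observed one."""
    matches = sum(1 for o, p in zip(observed, predicted) if o == p)
    return matches / len(observed)


if __name__ == "__main__":
    # Hypothetical sequence fragment, not taken from any real precursor.
    for position, window in candidate_windows("AGKRSPEL"):
        print(position, window, sum(one_hot(window)))
```

Indicator vectors of this kind could then be passed to any binary classifier, such as the logistic regression, multi-layer perceptron or k-nearest neighbor models named in the abstract; the fitting step is omitted here because the abstract does not specify it.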
