DIRProt: a computational approach for discriminating insecticide resistant proteins from non-resistant proteins

Background: Insecticide resistance is a major challenge for insect pest control programs in crop protection and in human and animal health. Resistance to different insecticides is conferred by proteins encoded by certain classes of insect genes. To date, no computational tool is available for distinguishing insecticide resistant proteins from non-resistant proteins. Such a tool would help predict insecticide resistant proteins, which can then be targeted when developing appropriate insecticides.

Results: Five feature sets, viz., amino acid composition (AAC), di-peptide composition (DPC), pseudo amino acid composition (PAAC), composition-transition-distribution (CTD) and auto-correlation function (ACF), were used to map the protein sequences into numeric feature vectors. The encoded vectors were then used as input to a support vector machine (SVM) for classifying insecticide resistant and non-resistant proteins. The RBF kernel yielded higher accuracies than the other kernels, and the DPC feature set yielded higher accuracies than the other feature sets. The proposed approach achieved an overall accuracy of >90% in discriminating resistant from non-resistant proteins. Further, each of the two classes of resistant proteins, i.e., detoxification-based and target-based, was discriminated from non-resistant proteins with >95% accuracy. An accuracy of >95% was also observed when discriminating proteins involved in detoxification-based resistance from those involved in target-based resistance. The proposed approach not only outperformed the Blastp, PSI-Blast and Delta-Blast algorithms, but also achieved >92% accuracy when assessed on an independent dataset of 75 insecticide resistant proteins.

Conclusions: This paper presents the first computational approach for discriminating insecticide resistant proteins from non-resistant proteins. Based on the proposed approach, an online prediction server, DIRProt, has also been developed for computational prediction of insecticide resistant proteins; it is accessible at http://cabgrid.res.in:8080/dirprot/. The proposed approach is believed to supplement the wet-lab efforts needed to develop dynamic insecticides by targeting the insecticide resistant proteins.
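To make the sequence-to-vector step concrete, the following is a minimal Python sketch of two of the feature sets named in the Results, AAC (20 values) and DPC (400 values). It is an illustration under stated assumptions, not the DIRProt implementation: the function names `aac` and `dpc`, the alphabetical residue ordering, and the choice to ignore non-standard residues before normalizing are all hypothetical, since the abstract does not specify these details.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids; the ordering is an assumption for illustration.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, e.g. "AA", "AC", ..., "YY".
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def aac(seq):
    """Amino acid composition: fraction of each standard residue (20 values).

    Assumption: non-standard symbols (e.g. 'X') are ignored when normalizing.
    """
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    return [counts[a] / total if total else 0.0 for a in AMINO_ACIDS]


def dpc(seq):
    """Di-peptide composition: fraction of each adjacent residue pair (400 values).

    Assumption: pairs containing a non-standard symbol are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts[d] for d in DIPEPTIDES)
    return [counts[d] / total if total else 0.0 for d in DIPEPTIDES]


if __name__ == "__main__":
    toy = "ACAC"  # toy sequence, not a real protein
    print(len(aac(toy)), len(dpc(toy)))
```

Each protein thus becomes a fixed-length vector regardless of its sequence length, which is what allows sequences of different lengths to be passed to a classifier such as the SVM described above.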

[1]  Kostas Iatrou,et al.  comprehensive molecular insect science , 2004 .

[2]  Thomas L. Madden,et al.  Domain enhanced lookup time accelerated BLAST , 2012, Biology Direct.

[3]  Andrew P. Bradley,et al.  The use of the area under the ROC curve in the evaluation of machine learning algorithms , 1997, Pattern Recognit..

[4]  Hiroyuki Ogata,et al.  AAindex: Amino Acid Index Database , 1999, Nucleic Acids Res..

[5]  Hui Ding,et al.  Predicting ion channels and their types by the dipeptide mode of pseudo amino acid composition. , 2011, Journal of theoretical biology.

[6]  R. ffrench-Constant,et al.  A point mutation in a Drosophila GABA receptor confers insecticide resistance , 1993, Nature.

[7]  May R Berenbaum,et al.  Molecular mechanisms of metabolic resistance to synthetic and natural xenobiotics. , 2007, Annual review of entomology.

[8]  Cyril Zipfel,et al.  The Leucine-Rich Repeat Receptor-Like Kinase BRASSINOSTEROID INSENSITIVE1-ASSOCIATED KINASE1 and the Cytochrome P450 PHYTOALEXIN DEFICIENT3 Contribute to Innate Immunity to Aphids in Arabidopsis1[C][W][OPEN] , 2014, Plant Physiology.

[9]  R. Feyereisen,et al.  4.1 – Insect Cytochrome P450 , 2005 .

[10]  John C. Morgan,et al.  Identification and distribution of a GABA receptor mutation conferring dieldrin resistance in the malaria vector Anopheles funestus in Africa , 2011, Insect biochemistry and molecular biology.

[11]  Achuthsankar S. Nair,et al.  Composition, Transition and Distribution (CTD) — A dynamic feature for predictions based on hierarchical structure of cellular sorting , 2011, 2011 Annual IEEE India Conference.

[12]  C. Hogue,et al.  Armadillo: domain boundary prediction by amino acid composition. , 2005, Journal of molecular biology.

[13]  Tatsuya Akutsu,et al.  Subcellular location prediction of proteins using support vector machines with alignment of block sequences utilizing amino acid composition , 2007, BMC Bioinformatics.

[14]  Saman K. Halgamuge,et al.  Splice site identification using probabilistic parameters and SVM classification , 2006 .

[15]  Ying Gao,et al.  Bioinformatics Applications Note Sequence Analysis Cd-hit Suite: a Web Server for Clustering and Comparing Biological Sequences , 2022 .

[16]  Fangfei Lin,et al.  Expression patterns, mutation detection and RNA interference of Rhopalosiphum padi voltage-gated sodium channel genes , 2016, Scientific Reports.

[17]  Horacio Samaniego,et al.  Insecticide Resistance Mechanisms in the Green Peach Aphid Myzus persicae (Hemiptera: Aphididae) I: A Transcriptomic Survey , 2012, PloS one.

[18]  M. Williamson,et al.  Knockdown resistance to DDT and pyrethroids: from target-site mutations to molecular modelling. , 2008, Pest management science.

[19]  Karl Kornacker,et al.  RNA-Seq and molecular docking reveal multi-level pesticide resistance in the bed bug , 2012, BMC Genomics.

[20]  Fang Zhu,et al.  Co-up-regulation of three P450 genes in response to permethrin exposure in permethrin resistant house flies, Musca domestica , 2008, BMC Physiology.

[21]  R. ffrench-Constant,et al.  The Molecular Genetics of Insecticide Resistance , 2013, Genetics.

[22]  A. Martins,et al.  Insecticide Resistance and Fitness Cost , 2016 .

[23]  Ernest Hodgson,et al.  Insect Cytochrome P450 , 1991 .

[24]  D Haussler,et al.  Knowledge-based analysis of microarray gene expression data by using support vector machines. , 2000, Proceedings of the National Academy of Sciences of the United States of America.

[25]  Shinji Kasai,et al.  Overexpression of cytochrome P450 genes in pyrethroid-resistant Culex quinquefasciatus. , 2010, Insect biochemistry and molecular biology.

[26]  F. Simard,et al.  Kdr-based insecticide resistance in Anopheles gambiae s.s populations in , 2011, BMC Research Notes.

[27]  Fang Zhu,et al.  Differential expression of CYP6A5 and CYP6A5v2 in pyrethroid-resistant house flies, Musca domestica. , 2008, Archives of insect biochemistry and physiology.

[28]  Wei Dou,et al.  Mining Genes Involved in Insecticide Resistance of Liposcelis bostrychophila Badonnel by Transcriptome and Expression Profile Analysis , 2013, PloS one.

[29]  Chris H. Q. Ding,et al.  Multi-class protein fold recognition using support vector machines and neural networks , 2001, Bioinform..

[30]  J. Oakeshott,et al.  The genomics of insecticide resistance , 2003, Genome Biology.

[31]  K. Chou Prediction of protein cellular attributes using pseudo‐amino acid composition , 2001 .

[32]  Fei Li,et al.  Mutations in acetylcholinesterase associated with insecticide resistance in the cotton aphid, Aphis gossypii Glover. , 2004, Insect biochemistry and molecular biology.

[33]  I. Muchnik,et al.  Prediction of protein folding class using global description of amino acid sequence. , 1995, Proceedings of the National Academy of Sciences of the United States of America.

[34]  H. Rochat,et al.  Role of lysine and tryptophan residues in the biological activity of toxin VII (Ts gamma) from the scorpion Tityus serrulatus. , 1999, European journal of biochemistry.

[35]  K. Chou Prediction of protein cellular attributes using pseudo‐amino acid composition , 2001, Proteins.

[36]  Janet Hemingway,et al.  The molecular basis of insecticide resistance in mosquitoes. , 2004, Insect biochemistry and molecular biology.

[37]  Nannan Liu,et al.  Pyrethroid Resistance in Insects: Genes, Mechanisms, and Regulation , 2012 .

[38]  Kuo-Chen Chou,et al.  iNR-PhysChem: A Sequence-Based Predictor for Identifying Nuclear Receptors and Their Subfamilies via Physical-Chemical Property Matrix , 2012, PloS one.

[39]  J. McAllister,et al.  Insecticide resistance and vector control. , 1998, Journal of agromedicine.

[40]  Kuo-Chen Chou,et al.  Using amphiphilic pseudo amino acid composition to predict enzyme subfamily classes , 2005, Bioinform..

[41]  K. Chou,et al.  Support vector machines for predicting membrane protein types by using functional domain composition. , 2003, Biophysical journal.

[42]  J Hemingway,et al.  Glutathione S-transferases as antioxidant defence agents confer pyrethroid resistance in Nilaparvata lugens. , 2001, The Biochemical journal.

[43]  Ralf Nauen,et al.  IRAC: Mode of action classification and insecticide resistance management. , 2015, Pesticide biochemistry and physiology.

[44]  Wei Chen,et al.  iRSpot-PseDNC: identify recombination spots with pseudo dinucleotide composition , 2013, Nucleic acids research.

[45]  Colin Loftin,et al.  Spatial Autocorrelation Models for Galton's Problem , 1981 .

[46]  Dongsup Kim,et al.  Prediction of protein secondary structure content using amino acid composition and evolutionary information , 2005, Proteins.

[47]  Janet Hemingway,et al.  A single mutation in the GSTe2 gene allows tracking of metabolically based insecticide resistance in a major malaria vector , 2014, Genome Biology.

[48]  Wen-Jer Wu,et al.  Discovery of Genes Related to Insecticide Resistance in Bactrocera dorsalis by Functional Genomic Analysis of a De Novo Assembled Transcriptome , 2012, PloS one.

[49]  Steven Salzberg,et al.  Finding Genes in DNA with a Hidden Markov Model , 1997, J. Comput. Biol..

[50]  Y. Pelletier,et al.  Environmental stresses induce the expression of putative glycine‐rich insect cuticular protein genes in adult Leptinotarsa decemlineata (Say) , 2008, Insect molecular biology.

[51]  K. Chou,et al.  Using Functional Domain Composition and Support Vector Machines for Prediction of Protein Subcellular Location* , 2002, The Journal of Biological Chemistry.

[52]  Wei Chen,et al.  Prediction of midbody, centrosome and kinetochore proteins based on gene ontology information. , 2010, Biochemical and biophysical research communications.

[53]  A. Mutero,et al.  Resistance-associated point mutations in insecticide-insensitive acetylcholinesterase. , 1994, Proceedings of the National Academy of Sciences of the United States of America.

[54]  Vladimir N. Vapnik,et al.  The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.

[55]  Huizhu Yuan,et al.  De novo transcriptome and expression profile analyses of the Asian corn borer (Ostrinia furnacalis) reveals relevant flubendiamide response genes , 2017, BMC Genomics.

[56]  Nello Cristianini,et al.  An Introduction to Support Vector Machines and Other Kernel-based Learning Methods , 2000 .

[57]  Ian Denholm,et al.  Knockdown resistance (kdr) to DDT and pyrethroid insecticides maps to a sodium channel gene locus in the housefly (Musca domestica) , 1993, Molecular and General Genetics MGG.

[58]  Asifullah Khan,et al.  Predicting membrane protein types by fusing composite protein sequence features into pseudo amino acid composition. , 2011, Journal of theoretical biology.

[59]  X. Chen,et al.  SVM-Prot: web-based support vector machine software for functional classification of a protein from its primary sequence , 2003, Nucleic Acids Res..

[60]  Ian Denholm,et al.  Delayed cuticular penetration and enhanced metabolism of deltamethrin in pyrethroid-resistant strains of Helicoverpa armigera from China and Pakistan. , 2006, Pest management science.

[61]  Nai-Yang Deng,et al.  Prediction of enzyme subfamily class via pseudo amino acid composition by incorporating the conjoint triad feature. , 2010, Protein and peptide letters.

[62]  Yujie Cai,et al.  The influence of dipeptide composition on protein thermostability , 2004, FEBS letters.

[63]  Masahiro Miyazaki,et al.  Cloning and sequencing of the para-type sodium channel gene from susceptible and kdr-resistant German cockroaches (Blattella germanica) and house fly (Musca domestica) , 1996, Molecular and General Genetics MGG.

[64]  F C Kafatos,et al.  Gene expression in insecticide resistant and susceptible Anopheles gambiae strains constitutively or after insecticide exposure , 2005, Insect molecular biology.

[65]  David Weetman,et al.  Does kdr genotype predict insecticide-resistance phenotype in mosquitoes? , 2009, Trends in parasitology.

[66]  Robert Edwards,et al.  Glutathione Transferases , 2010, The arabidopsis book.

[67]  R. ffrench-Constant,et al.  Cyclodiene insecticide resistance: from molecular to population genetics. , 2000, Annual review of entomology.

[68]  Kuo-Chen Chou,et al.  Predicting membrane protein type by functional domain composition and pseudo-amino acid composition. , 2006, Journal of theoretical biology.

[69]  R. ffrench-Constant,et al.  The genetics and genomics of insecticide resistance. , 2004, Trends in genetics : TIG.

[70]  Ting Li,et al.  Multiple Cytochrome P450 Genes: Their Constitutive Overexpression and Permethrin Induction in Insecticide Resistant Mosquitoes, Culex quinquefasciatus , 2011, PloS one.

[71]  John G. Oakeshott,et al.  Caboxylesterases in the metabolism and toxicity of pesticides , 2011 .

[72]  Oliver Kohlbacher,et al.  MultiLoc: prediction of protein subcellular localization using N-terminal targeting sequences, sequence motifs and amino acid composition , 2006, Bioinform..

[73]  E. Myers,et al.  Basic local alignment search tool. , 1990, Journal of molecular biology.

[74]  K. Chou Pseudo Amino Acid Composition and its Applications in Bioinformatics, Proteomics and System Biology , 2009 .

[75]  J. Hemingway,et al.  5.11 – Glutathione Transferases , 2005 .