Discrimination of acidic and alkaline enzyme using Chou's pseudo amino acid composition in conjunction with probabilistic neural network model.

Enzyme catalysis is one of the most essential and striking processes among all the complex processes that have evolved in living organisms. Enzymes are biological catalysts that play a significant role in industrial applications as well as in medicine, owing to their high specificity, selectivity and catalytic efficiency. Refining the catalytic efficiency of enzymes, which are classified into acidic and alkaline, has become the most challenging task in enzyme engineering. Discriminating acidic from alkaline enzymes through experimental approaches is difficult, and sometimes impossible, because established structures are lacking. Therefore, it is highly desirable to develop a computational model that discriminates acidic and alkaline enzymes from their primary sequences. In this study, we develop a robust, accurate and high-throughput computational model using two discrete sample representation methods: pseudo amino acid composition (PseAAC) and split amino acid composition. Several classification algorithms, including probabilistic neural network (PNN), K-nearest neighbor, decision tree, multi-layer perceptron and support vector machine, are applied to predict acidic and alkaline enzymes. A 10-fold cross-validation test and several statistical measures, namely accuracy, F-measure and area under the ROC curve, are used to evaluate the proposed model. Its performance is examined on two benchmark datasets. The empirical results show that PNN in conjunction with PseAAC is quite promising compared with the existing approaches in the literature so far, achieving 96.3% accuracy on dataset1 and 99.2% on dataset2. The proposed model might be useful for basic research and drug-related application areas.
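
The pipeline summarized in the abstract (sequence-derived features, a PNN classifier, 10-fold cross-validation) can be illustrated with a minimal Python sketch. It is not the authors' implementation. Plain amino acid composition stands in for PseAAC, because the physicochemical property tables PseAAC requires are not given here. The segment lengths in `split_aac`, the smoothing parameter `sigma`, the interleaved fold assignment and all function names are assumptions made for illustration. The toy sequences are synthetic strings, not real enzymes.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aac(seq):
    """20-dimensional amino acid composition (relative frequencies)."""
    counts = Counter(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def split_aac(seq, n_term=25, c_term=25):
    """Composition of the N-terminal, middle and C-terminal segments,
    concatenated into a 60-dimensional vector.
    The segment lengths are illustrative assumptions."""
    cut = max(0, len(seq) - c_term)
    head, middle, tail = seq[:n_term], seq[n_term:cut], seq[cut:]
    return aac(head) + aac(middle) + aac(tail)


def pnn_predict(train_X, train_y, x, sigma=0.1):
    """Probabilistic neural network: one Gaussian kernel per training
    sample, averaged per class; the class with the largest average wins."""
    sums, counts = {}, {}
    for xi, yi in zip(train_X, train_y):
        d2 = sum((a - b) ** 2 for a, b in zip(xi, x))
        sums[yi] = sums.get(yi, 0.0) + math.exp(-d2 / (2 * sigma ** 2))
        counts[yi] = counts.get(yi, 0) + 1
    return max(sums, key=lambda c: sums[c] / counts[c])


def kfold_accuracy(X, y, k=10, sigma=0.1):
    """k-fold cross-validation accuracy with interleaved folds
    (sample i goes to fold i % k). Assumes len(X) > k."""
    n = len(X)
    correct = 0
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        train_X = [X[i] for i in range(n) if i not in test_idx]
        train_y = [y[i] for i in range(n) if i not in test_idx]
        for i in test_idx:
            correct += pnn_predict(train_X, train_y, X[i], sigma) == y[i]
    return correct / n


# Synthetic, trivially separable toy strings (NOT real enzyme sequences).
toy_acidic = ["DE" * (5 + i) + "AG" * 3 for i in range(10)]
toy_alkaline = ["KR" * (5 + i) + "AG" * 3 for i in range(10)]

X = [aac(s) for s in toy_acidic + toy_alkaline]
y = ["acidic"] * 10 + ["alkaline"] * 10

accuracy = kfold_accuracy(X, y, k=10, sigma=0.1)
print(f"10-fold accuracy on toy data: {accuracy:.3f}")
```

On real data, `sigma` would typically be tuned, and F-measure and area under the ROC curve would be reported alongside accuracy, as the abstract describes.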
