Machine Learning Classifiers based on Predicting Membrane Protein using Decision Tree and Random Forest

Protein data in bioinformatics is used for drug design, disease detection, gene identification in living organisms, the study of immune relationships, and similar tasks. Analyzing protein data to capture the full range of protein functions and to answer biological queries is difficult. Primary protein information is available in amino acid sequence datasets. It is therefore sensible to construct an efficient predictor, based on this primary biological data, for identifying protein functions and their types. Machine learning classifiers often handle imbalanced and very large datasets well. In this study, the machine learning classifiers are built on features of membrane protein sequences and rely on PseAAC (Pseudo Amino Acid Composition) descriptors. The proposed prediction model is based on decision tree classifiers using information gain and the Gini index, together with random forest, and performance is compared by the resulting accuracy. Among the classifiers, random forest performed well, with an accuracy of 91.67%.
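The two decision tree splitting criteria named in the abstract, information gain and the Gini index, can be illustrated with a minimal sketch. This is not the study's implementation: the class labels, the toy split, and the helper names `entropy`, `gini`, and `split_gain` are assumptions made up for illustration, and no real PseAAC features are involved.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def split_gain(parent, left, right, impurity):
    """Impurity reduction obtained by splitting `parent` into `left` and `right`.

    With `impurity=entropy` this is the information gain; with
    `impurity=gini` it is the decrease in Gini impurity.
    """
    n = len(parent)
    weighted = (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)
    return impurity(parent) - weighted


# Hypothetical toy data: 8 sequences split on some threshold of one feature.
parent = ["membrane"] * 4 + ["non-membrane"] * 4
left = ["membrane"] * 3 + ["non-membrane"] * 1
right = ["membrane"] * 1 + ["non-membrane"] * 3

info_gain = split_gain(parent, left, right, entropy)
gini_gain = split_gain(parent, left, right, gini)
print(f"information gain: {info_gain:.4f}")
print(f"Gini decrease:    {gini_gain:.4f}")
```

A decision tree chooses, at each node, the split that maximizes one of these quantities. A random forest trains many such trees on bootstrap samples with random feature subsets and combines their votes.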
