Self-organizing mapping based swarm intelligence for secondary and tertiary proteins classification

Proteins play a significant role in animal and human health. Interactions among proteins are numerous and complex, and separating proteins is a challenging process in molecular biology. Computational tools help simulate the analysis and reduce large training data to a smaller set of test data. In this work, large proteins are mapped using self-organizing maps (SOMs). Neural-network-based SOMs play a significant role in reducing the irregular shapes of protein interactions, and iterative checking enables all proteins to be organized. In the next stage, particle swarm optimization (PSO), a swarm intelligence technique, is applied to classify the protein families. Secondary (two-dimensional) and tertiary (three-dimensional) proteins are grouped. Two-dimensional proteins contain fewer hydrocarbons than three-dimensional proteins. For faster analysis, the angles of the proteins are taken into account. The SOM is compared with the Bounding Box approach. Finally, the experimental evaluations show that swarm intelligence achieves faster processing while consuming less memory and time. Because PSO combines protein datasets using fuzzy values, similar proteins are grouped compactly. Bounding Box, on the other hand, uses crisp values and therefore needs more space to organize the whole dataset. Without SOMs, swarm intelligence alone also performs poorly because of excessive processing time and storage requirements. Moreover, for almost all classification and clustering tools, the overall classification task becomes slow, space-consuming, and less sensitive when the input datasets contain noise and irrelevant data. Thus, the proposed SOM-based PSO approach classifies secondary and tertiary proteins efficiently with less processing time.
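The two-stage idea described above (first compress the data with a SOM, then let a particle swarm search for class representatives) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy two-feature "angle" data, the 3x3 map size, the linear decay schedules, the PSO coefficients, and the quantization-error fitness function are all assumptions chosen for illustration, and every function name below is hypothetical.

```python
import math
import random

def dist2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def make_toy_features(n_per_class=20, seed=0):
    """Hypothetical stand-in for normalized protein angle features in [0, 1]."""
    rng = random.Random(seed)
    def blob(center):
        return [[center + rng.uniform(-0.1, 0.1) for _ in range(2)]
                for _ in range(n_per_class)]
    return blob(0.2) + blob(0.8)

def best_matching_unit(weights, x):
    """Grid coordinate of the SOM node whose weight vector is closest to x."""
    return min(weights, key=lambda node: dist2(weights[node], x))

def train_som(data, rows, cols, epochs=30, lr0=0.5, seed=0):
    """Train a small rectangular SOM; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # assumed linear decay
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # assumed shrinking radius
        for x in rng.sample(data, len(data)):    # shuffled pass over the data
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma * sigma))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])  # pull node toward sample
    return weights

def quantization_error(points, ca, cb):
    """Sum of squared distances from each point to its nearer centroid."""
    return sum(min(dist2(p, ca), dist2(p, cb)) for p in points)

def pso_two_centroids(points, n_particles=15, iters=60, seed=1):
    """Global-best PSO; each particle encodes two candidate centroids."""
    rng = random.Random(seed)
    dim = len(points[0])
    def cost(pos):
        return quantization_error(points, pos[:dim], pos[dim:])
    swarm = [[rng.random() for _ in range(2 * dim)] for _ in range(n_particles)]
    vel = [[0.0] * (2 * dim) for _ in range(n_particles)]
    pbest = [p[:] for p in swarm]
    pbest_cost = [cost(p) for p in swarm]
    g = min(range(n_particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]
    inertia, cognitive, social = 0.7, 1.5, 1.5   # assumed coefficients
    for _ in range(iters):
        for i, pos in enumerate(swarm):
            for d in range(2 * dim):
                vel[i][d] = (inertia * vel[i][d]
                             + cognitive * rng.random() * (pbest[i][d] - pos[d])
                             + social * rng.random() * (gbest[d] - pos[d]))
                pos[d] += vel[i][d]
            c = cost(pos)
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[:], c
    return gbest[:dim], gbest[dim:], gbest_cost

def classify(x, ca, cb):
    """Label 0 if x is at least as close to centroid ca as to cb, else 1."""
    return 0 if dist2(x, ca) <= dist2(x, cb) else 1

if __name__ == "__main__":
    features = make_toy_features()
    som = train_som(features, rows=3, cols=3)
    codebook = list(som.values())        # the swarm sees 9 prototypes, not 40 samples
    ca, cb, err = pso_two_centroids(codebook)
    print([classify(x, ca, cb) for x in features])
```

In this sketch the swarm only evaluates the SOM prototype vectors rather than every sample, which is one plausible reading of why the SOM stage reduces time and storage for the swarm stage. The sketch uses crisp nearest-centroid labels for brevity; the fuzzy membership values mentioned in the abstract are not modeled here.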
