An Extensive Report on Cellular Automata Based Artificial Immune System for Strengthening Automated Protein Prediction

Artificial Immune System (AIS-MACA), a novel computational intelligence technique, can be used to strengthen automated protein prediction by making the system more adaptable and more parallel. Most existing approaches are sequential, classify the input into four major classes, and are designed for similar sequences. AIS-MACA is designed to identify ten classes from sequences that share twilight-zone similarity and identity with the training sequences, including mixed and hybrid variations. The method also predicts three states (helix, strand, and coil) for the secondary structure. Our design considers 10 feature selection methods and 4 classifiers to develop MACA (Multiple Attractor Cellular Automata) based classifiers, one built for each of the ten classes. We tested the proposed classifier on twilight-zone and high-similarity benchmark datasets against more than three dozen modern competing predictors; the results show that AIS-MACA provides the best overall accuracy, ranging from 80% to 89.8% depending on the dataset.
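To make the idea of attractor-based classification concrete, the following is a minimal sketch of how a linear (XOR-based) cellular automaton can act as a classifier: an encoded input is evolved until it reaches a fixed point, and the fixed point (attractor) determines the class. It is a toy illustration only, not the trained AIS-MACA rule set. The 4-cell rule matrix `T`, the function names, and the mapping from attractors to helix/strand/coil labels are all assumptions made for this example.

```python
# Toy multiple-attractor cellular automaton over GF(2).
# Hypothetical 4-cell rule matrix: each cell depends only on itself or an
# immediate neighbour, and every state reaches a fixed point in one step.
T = (
    (1, 0, 0, 0),  # cell 0 keeps its own value
    (1, 0, 0, 0),  # cell 1 copies its left neighbour
    (0, 0, 0, 1),  # cell 2 copies its right neighbour
    (0, 0, 0, 1),  # cell 3 keeps its own value
)

# Hypothetical labelling of the four attractors (not taken from the paper).
ATTRACTOR_TO_CLASS = {
    (0, 0, 0, 0): "helix",
    (1, 1, 1, 1): "strand",
    (0, 0, 1, 1): "coil",
    (1, 1, 0, 0): "coil",
}


def step(state, rule=T):
    """Apply one synchronous update: next = rule * state (mod 2)."""
    return tuple(sum(r * s for r, s in zip(row, state)) % 2 for row in rule)


def find_attractor(state, rule=T, max_steps=64):
    """Iterate the automaton until the state stops changing."""
    current = tuple(state)
    for _ in range(max_steps):
        nxt = step(current, rule)
        if nxt == current:
            return current
        current = nxt
    raise RuntimeError("no fixed point reached; rule is not a MACA")


def classify(state):
    """Label an encoded input by the attractor basin it falls into."""
    return ATTRACTOR_TO_CLASS[find_attractor(state)]


if __name__ == "__main__":
    for pattern in [(1, 0, 1, 0), (0, 1, 0, 0), (0, 0, 0, 1), (1, 0, 0, 1)]:
        print(pattern, "->", find_attractor(pattern), classify(pattern))
```

Because each cell is updated independently from its neighbours, every update step can be computed for all cells at once, which is the kind of parallelism the abstract refers to. How protein sequences are encoded into cell states and how the rules are selected are not described here and are left out of the sketch.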
