Structural profile matrices for predicting structural properties of proteins.

Predicting the structural properties of proteins plays a key role in predicting their 3D structure. In this study, new structural profile matrices (SPMs) are developed for protein secondary structure, solvent accessibility, and torsion angle class predictions, which can be used as input to 3D structure prediction algorithms. The structural templates employed in computing the SPMs are detected by eight alignment methods in the LOMETS server, an affine-gap alignment method, ScanProsite, PfamScan, and HHblits. The contribution of each template is weighted by its similarity to the target, which is assessed by several sequence alignment scores. For comparison, the SPMs are also computed using Homolpro, which uses BLAST for target-template alignments and does not assign weights to templates. When the SPMs are incorporated into the DSPRED classifier, prediction accuracy improves significantly, as demonstrated by cross-validation experiments on two difficult benchmarks. The most accurate predictions are obtained using the SPMs derived by the threading methods in the LOMETS server; however, these SPMs are also the most computationally expensive to compute.
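The idea of weighting each template's contribution can be illustrated with a minimal sketch. The abstract does not give the exact formula, so everything below is an assumption made for illustration: the function name `build_spm`, the three-class secondary structure alphabet `HEL`, the representation of a template as a (weight, aligned labels) pair, and the choice to leave positions with no template coverage as all zeros. Each row of the resulting matrix holds the weight-normalized frequency of each structural class at one target position.

```python
import numpy as np

# Assumed three-class secondary structure alphabet: helix, strand, loop.
SS_CLASSES = "HEL"


def build_spm(target_len, templates, classes=SS_CLASSES):
    """Build a weighted structural profile matrix (illustrative sketch).

    templates: iterable of (weight, aligned_labels) pairs, where
    aligned_labels maps a 0-based target position to the structural
    class label of the template residue aligned to that position.
    """
    index = {c: k for k, c in enumerate(classes)}
    spm = np.zeros((target_len, len(classes)))

    # Accumulate each template's weight into the class it supports.
    for weight, aligned_labels in templates:
        for pos, label in aligned_labels.items():
            spm[pos, index[label]] += weight

    # Normalize each covered row to sum to 1; rows with no template
    # coverage stay all zeros (an assumption of this sketch).
    totals = spm.sum(axis=1, keepdims=True)
    np.divide(spm, totals, out=spm, where=totals > 0)
    return spm


# Hypothetical example: two templates with similarity weights 0.75 and 0.25.
templates = [
    (0.75, {0: "H", 1: "H", 2: "L"}),
    (0.25, {1: "E", 2: "L", 3: "E"}),
]
spm = build_spm(5, templates)
print(spm)
```

Setting every weight to 1 in this sketch reduces it to plain frequency counting over the templates, which is the general flavor of an unweighted profile.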
