Improving protein fold recognition and template-based modeling by employing probabilistic-based matching between predicted one-dimensional structural properties of query and corresponding native properties of templates

MOTIVATION: In recent years, the development of single-method fold-recognition servers has lagged behind that of consensus and multiple-template techniques. However, a good consensus prediction relies on the accuracy of the individual methods. This article reports our efforts to further improve a single-method fold-recognition technique called SPARKS by changing the alignment scoring function and incorporating the SPINE-X techniques, which improve the prediction of secondary structure, backbone torsion angles and solvent-accessible surface area.

RESULTS: The new method, called SPARKS-X, was tested with the SALIGN benchmark for alignment accuracy, the Lindahl and SCOP benchmarks for fold recognition, and the CASP 9 blind test for structure prediction. The method is compared to several state-of-the-art techniques such as HHPRED and BoostThreader. Results show that SPARKS-X is one of the best single-method fold-recognition techniques. We further note that incorporating multiple templates and refinement in model building will likely further improve SPARKS-X.

AVAILABILITY: The method is available as a SPARKS-X server at http://sparks.informatics.iupui.edu/

Yaoqi Zhou | Yuedong Yang | Huiying Zhao | Eshel Faraggi
