General overview on structure prediction of twilight-zone proteins

Predicting protein structure from amino acid sequence remains one of the most challenging problems in computational structural biology, despite the significant progress in recent years shown by the Critical Assessment of protein Structure Prediction (CASP) experiments. When experimentally determined structures are unavailable, predicted structures may serve as starting points for studying a protein. If the target protein contains a region homologous to a protein of known structure, a high-resolution (typically <1.5 Å) model can be built by comparative modelling. However, when the target has low sequence similarity to the available templates (a so-called twilight-zone protein, with less than 30% sequence identity), structure prediction has to be initiated from scratch. Traditionally, the structures of twilight-zone proteins are predicted by threading or ab initio methods. The current trend suggests that combining different methods improves success in predicting twilight-zone proteins. This mini review discusses the methods, progress and challenges in the prediction of twilight-zone proteins.
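The 30% identity cutoff mentioned above can be illustrated with a minimal Python sketch. It assumes the target and template have already been aligned (gaps written as `-`) and that identity is counted over columns where neither sequence has a gap; other denominators are in common use, so this is one convention rather than the definition. The function names and the `TWILIGHT_CUTOFF` constant are hypothetical and are not taken from any tool named in the review.

```python
TWILIGHT_CUTOFF = 30.0  # percent identity; the threshold quoted in the abstract


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: both strings come from the same pairwise alignment, so they
    have equal length and use '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns in which both sequences have a residue.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


def suggest_strategy(identity_pct: float) -> str:
    """Map identity to the broad modelling route described in the abstract."""
    if identity_pct >= TWILIGHT_CUTOFF:
        return "comparative modelling"
    return "threading or ab initio"
```

In practice the choice of route also depends on alignment coverage and template quality, which this sketch ignores.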
