eQuant - A Server for Fast Protein Model Quality Assessment by Integrating High-Dimensional Data and Machine Learning

In molecular biology, reliable protein structure models are essential for understanding the functional roles of proteins and the diseases related to them. Structures are determined by complex and resource-demanding experiments, and in silico structure modeling and refinement approaches have been established to cope with experimental limitations. Nevertheless, both experimental and computational methods are prone to errors. Consequently, small local regions or even the whole tertiary structure can be unreliable or erroneous, which may lead researchers to formulate false hypotheses and draw false conclusions.


[41]  Peter D. Moore Iranian natural history , 1977, Nature.

[42]  J. Thornton,et al.  AQUA and PROCHECK-NMR: Programs for checking the quality of protein structures solved by NMR , 1996, Journal of biomolecular NMR.

[43]  A. Elofsson,et al.  Local moves: An efficient algorithm for simulation of protein folding , 1995, Proteins.

[44]  W. H. Dall The National Antarctic Expedition, 1901-4 , 1907 .

[45]  T. Blundell,et al.  Structural biology and drug discovery of difficult targets: the limits of ligandability. , 2012, Chemistry & biology.

[46]  I. Kuntz Structure-Based Strategies for Drug Design and Discovery , 1992, Science.

[47]  Andreas Prlic,et al.  BioJava: an open-source framework for bioinformatics in 2012 , 2012, Bioinform..

[48]  David S. Goodsell,et al.  The RCSB Protein Data Bank: new resources for research and education , 2012, Nucleic Acids Res..

[49]  Marco Biasini,et al.  Toward the estimation of the absolute quality of individual protein structure models , 2010, Bioinform..

[50]  Chi-Ren Shyu,et al.  Determining Effects of Non-synonymous SNPs on Protein-Protein Interactions using Supervised and Semi-supervised Learning , 2014, PLoS Comput. Biol..

[51]  ALFRED W. PORTER,et al.  The Refractivity of Radium Emanation , 1909, Nature.

[52]  M. A. Fernandes,et al.  Detection and quantification of microorganisms in a heterogeneous foodstuff by image analysis , 1988, Comput. Appl. Biosci..

[53]  Ugo Bastolla,et al.  Detecting Selection on Protein Stability through Statistical Mechanical Models of Folding and Evolution , 2014, Biomolecules.

[54]  Sebastian Bittrich,et al.  Fit3D: a web application for highly accurate screening of spatial residue patterns in protein structure data , 2016, Bioinform..

[55]  Pascal Benkert,et al.  QMEAN: A comprehensive scoring function for model quality assessment , 2008, Proteins.

[56]  Joost B. Beltman,et al.  Reproducibility of Illumina platform deep sequencing errors allows accurate determination of DNA barcodes in cells , 2016, BMC Bioinformatics.

[57]  P J Parrott PEAR-LEAF BLISTER-MITE (ERIOPHYES PIRI NAL.). , 1906, Science.

[58]  G. N. Ramachandran,et al.  Stereochemistry of polypeptide chain configurations. , 1963, Journal of molecular biology.

[59]  Krzysztof Fidelis,et al.  CASP prediction center infrastructure and evaluation measures in CASP10 and CASP ROLL , 2014, Proteins.

[60]  Francis K. H. Quek,et al.  Attribute bagging: improving accuracy of classifier ensembles by using random feature subsets , 2003, Pattern Recognit..

[61]  D. Bichet,et al.  Molecular biology of hereditary diabetes insipidus. , 2005, Journal of the American Society of Nephrology : JASN.

[62]  G. CHRYSTAL A Christmas Visit to Ben Nevis Observatory , 1884, Nature.

[63]  Ian H. Witten,et al.  The WEKA data mining software: an update , 2009, SKDD.

[64]  Adam Zemla,et al.  LGA: a method for finding 3D similarities in protein structures , 2003, Nucleic Acids Res..

[65]  M J Sippl,et al.  Structure-based evaluation of sequence comparison and fold recognition alignment accuracy. , 2000, Journal of molecular biology.

[66]  Manfred J. Sippl,et al.  Boltzmann's principle, knowledge-based mean fields and protein folding. An approach to the computational determination of protein structures , 1993, J. Comput. Aided Mol. Des..

[67]  D. Eisenberg,et al.  VERIFY3D: assessment of protein models with three-dimensional profiles. , 1997, Methods in enzymology.

[68]  Richard E. Dickerson,et al.  50 Years of Protein Structure Analysis , 2009 .

[69]  Björn Wallner,et al.  Improved model quality assessment using ProQ2 , 2012, BMC Bioinformatics.

[70]  Mitsuru Ishizuka,et al.  GeneView: multi-language human gene mapping library with a graphical user interface , 1993, Comput. Appl. Biosci..

[71]  Andreas Prlic,et al.  Sequence analysis , 2003 .

[72]  Ian H. Witten,et al.  Data mining in bioinformatics using Weka , 2004, Bioinform..

[73]  F. Melo,et al.  Assessing protein structures with a non-local atomic interaction energy. , 1998, Journal of molecular biology.

[74]  D. T. Jones,et al.  A new approach to protein fold recognition , 1992, Nature.

[75]  Dirk Labudde,et al.  eProS—a database and toolbox for investigating protein sequence–structure–function relationships through energy profiles , 2013, Nucleic Acids Res..

[76]  Francisco Melo,et al.  ANOLEA: A WWW Server to Assess Protein Structures , 1997, ISMB.

[77]  David S. Wishart,et al.  PROSESS: a protein structure evaluation suite and server , 2010, Nucleic Acids Res..

[78]  Dirk Labudde,et al.  Membrane Protein Stability Analyses by Means of Protein Energy Profiles in Case of Nephrogenic Diabetes Insipidus , 2012, Comput. Math. Methods Medicine.

[79]  Berthold Göttgens,et al.  BTR: training asynchronous Boolean models using single-cell expression data , 2016, BMC Bioinformatics.

[80]  David E. Kim,et al.  Free modeling with Rosetta in CASP6 , 2005, Proteins.

[81]  A. Fersht Structure and mechanism in protein science , 1998 .

[82]  Maksymilian Chruszcz,et al.  Benefits of structural genomics for drug discovery research. , 2009, Infectious disorders drug targets.

[83]  Roland L Dunbrack,et al.  Outcome of a workshop on applications of protein models in biomedical research. , 2009, Structure.