A benchmark server using high resolution protein structure data, and benchmark results for membrane helix predictions

Background: Helical membrane proteins are vital for the interaction of cells with their environment. Predicting the location of membrane helices in protein amino acid sequences provides substantial insight into their structure and function, and identifies membrane proteins in sequenced genomes. Currently there is no comprehensive benchmark tool for evaluating prediction methods, and no publication compares all available prediction tools. The existing benchmark literature is outdated, as it does not include recently determined membrane protein structures. It is also limited to global assessments, as specialised benchmarks for predicting specific classes of membrane proteins have not previously been carried out.

Description: We present a benchmark server at http://sydney.edu.au/pharmacy/sbio/software/TMH_benchmark.shtml that uses recent high resolution protein structural data to provide a comprehensive assessment of the accuracy of existing membrane helix prediction methods. The server also allows a user to upload predictions generated by novel methods and compare them against all existing methods evaluated by the server. Benchmark metrics include sensitivity and specificity of predictions for membrane helix location and orientation, and many others. The server allows for customised evaluations, such as assessing the performance of prediction methods for specific helical membrane protein subtypes. We report results for custom benchmarks that illustrate how the server may be used for specialised benchmarks. Which prediction method performs best depends on which measure is being benchmarked. The OCTOPUS membrane helix prediction method is consistently one of the highest performing methods across all measures in the benchmarks that we performed.

Conclusions: The benchmark server allows general and specialised assessment of existing and novel membrane helix prediction methods. Users can employ this benchmark server to determine the most suitable method for the type of prediction they need to perform, be it general whole-genome annotation or the prediction of specific types of helical membrane protein. Creators of novel prediction methods can use this benchmark server to evaluate the performance of their new methods. The benchmark server will be a valuable tool for researchers seeking to extract more sophisticated information from the large and growing protein sequence databases.
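To make the sensitivity and specificity metrics named in the Description concrete, the following is a minimal sketch of a per-residue evaluation. It is not the server's implementation. The function name `per_residue_metrics`, the label alphabet (`M` for membrane helix, `i`/`o` for inside/outside) and the metric definitions are assumptions made for illustration; in particular, some membrane helix benchmarks define specificity as TP/(TP+FP), whereas this sketch uses TN/(TN+FP).

```python
from math import sqrt


def per_residue_metrics(observed: str, predicted: str, helix_label: str = "M") -> dict:
    """Compare two per-residue label strings of equal length.

    `observed` is the structure-derived annotation and `predicted` is the
    output of a prediction method (both hypothetical inputs). A residue is
    treated as positive when its label equals `helix_label`.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")

    tp = fp = tn = fn = 0
    for obs, pred in zip(observed, predicted):
        obs_helix = obs == helix_label
        pred_helix = pred == helix_label
        if obs_helix and pred_helix:
            tp += 1
        elif pred_helix:
            fp += 1
        elif obs_helix:
            fn += 1
        else:
            tn += 1

    # Guard every ratio against an empty denominator.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "tp": tp, "fp": fp, "tn": tn, "fn": fn,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }
```

A call such as `per_residue_metrics("iiMMMMoo", "iMMMMooo")` compares a predicted helix that is shifted by one residue relative to the observed one. Segment-level and orientation metrics would need additional logic that this sketch does not attempt.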
