PRIMO: An Interactive Homology Modeling Pipeline

Automated servers for predicting the three-dimensional structure of proteins have advanced considerably over the years. These servers simplify the calculations but largely exclude users from the process. In this study, we present the PRotein Interactive MOdeling (PRIMO) pipeline for homology modeling of protein monomers. The pipeline eases the multi-step modeling process and reduces the workload required of the user, while still allowing user engagement at every step. Default parameters are given for each step, and these can be either modified or supplemented with additional external input. PRIMO has been designed for users of varying levels of experience with homology modeling. The pipeline incorporates a user-friendly interface that makes it easy to alter the parameters used during modeling. At each stage of the modeling process, the site provides suggestions to help novice users improve the quality of their models. PRIMO also allows users to model ligands and ions in complex with their protein targets. Herein, we assess the accuracy of the fully automated capabilities of the server, including a comparative analysis of the available alignment programs and of the refinement levels used during modeling. The tests presented here demonstrate the reliability of the PRIMO server when producing a large number of protein models. Although PRIMO focuses on user involvement in the homology modeling process, the results indicate that, when suitable templates are available, good-quality models can be produced even without user intervention. This establishes the baseline accuracy of PRIMO, which users can improve upon by adjusting parameters in their modeling runs. The accuracy of PRIMO’s automated scripts is being continuously evaluated by the CAMEO (Continuous Automated Model EvaluatiOn) project. The PRIMO site is free for non-commercial use and can be accessed at https://primo.rubi.ru.ac.za/.
