An integrated methodology for mining promiscuous proteins: a case study of an integrative bioinformatics approach for hepatitis C virus non-structural 5A protein.

A methodology is proposed for elucidating structural, functional, and mechanistic knowledge about promiscuous proteins; it takes the form of a workflow of integrated bioinformatics analyses. Sequence alignments with closely related homologues can reveal conserved regions that are functionally important. Scanning protein motif databases, together with secondary structure and surface accessibility predictions integrated with prediction of post-translational modification (PTM) sites, reveals functional and protein-binding motifs. Integrating this information about the protein with the GO, SCOP, and CATH annotations of the templates can help to formulate a 3D model with reasonable accuracy even in the case of distant sequence homology. A novel integrative model is proposed for the non-structural protein 5A (NS5A) of hepatitis C virus, a promiscuous hub protein with roles in virus replication and host interactions. The 3D structure of domain II was predicted using the Homo sapiens Replication factor-A protein-1 (RPA1) as a template, based on consensus meta-server results. Domain III is an intrinsically unstructured domain with a fold from the retroviral matrix protein; it mediates diverse protein interactions and is involved in viral replication. It also has a single-stranded DNA-binding protein motif (SSDP) signature for pyrimidine binding during viral replication. Two protein-binding motifs with high sequence conservation in disordered regions are proposed: the first corresponds to an Interleukin-8B receptor (IL-8R-B) signature, while the second shows high local similarity to the lymphotoxin beta receptor (LTβR). A mechanism is proposed for their contribution to NS5A's interception of the interferon signaling pathway. Lastly, the overlap between the LTβR and SSDP motifs is taken as a sign that NS5A behaves as a date hub.
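As a rough illustration of the first step of the workflow (finding conserved regions in an alignment with close homologues), the following minimal sketch scores each alignment column by the fraction of sequences that share the most common residue and then reports runs of highly conserved columns. It is not the method used in the study: the function names, the 0.9 threshold, the minimum run length, and the toy sequences are all assumptions chosen for illustration, and the sequences are placeholders rather than real NS5A data.

```python
from collections import Counter


def column_conservation(alignment):
    """Score each column as the fraction of sequences sharing the most
    common residue. Gaps ('-') never count as the shared residue."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for col in zip(*alignment):
        residues = [r for r in col if r != "-"]
        if not residues:
            scores.append(0.0)
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(col))
    return scores


def conserved_regions(scores, threshold=0.9, min_length=3):
    """Return half-open (start, end) index pairs for runs of at least
    min_length consecutive columns scoring >= threshold."""
    regions = []
    start = None
    for i, score in enumerate(scores):
        if score >= threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_length:
                regions.append((start, i))
            start = None
    if start is not None and len(scores) - start >= min_length:
        regions.append((start, len(scores)))
    return regions


# Hypothetical toy alignment (placeholder sequences, not NS5A).
toy_alignment = [
    "MKTAYLPWSG",
    "MKTGYLPWAG",
    "MKTSYLPW-G",
]
scores = column_conservation(toy_alignment)
regions = conserved_regions(scores)
print(regions)
```

In practice the alignment would come from a dedicated alignment tool, and a substitution-matrix or entropy-based score may be preferable to simple identity; the identity fraction is used here only because it is easy to follow.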
