Computational approaches for protein function prediction: a combined strategy from multiple sequence alignment to molecular docking-based virtual screening.

The functional characterization of proteins is a recurring challenge for the biochemical, medical and computational sciences. Although a protein's function must ultimately be confirmed at the bench, it can be predicted by computational approaches that then guide the experimental assays. Current comparative modeling methods allow the construction of accurate 3D models for proteins of unknown structure, provided that a crystal structure of a homologous protein is available. Binding regions can be proposed by using binding site predictors, data inferred from homologous crystal structures, and a careful interpretation of the multiple sequence alignment of the investigated protein and its homologs. Once the location of a binding site has been proposed, chemical ligands with a high likelihood of binding can be identified by ligand docking and structure-based virtual screening of chemical libraries. Most docking algorithms produce a ranked list in which each ligand of the library is represented by its lowest-energy docking configuration and the ligands are sorted by that energy. This review presents the state of the art of computational approaches to 3D protein comparative modeling and to the study of protein-ligand interactions. Furthermore, a possible combined multistep strategy for protein function prediction, based on multiple sequence alignment, comparative modeling, binding region prediction, and structure-based virtual screening of chemical libraries, is described. As practical examples, Abl-kinase molecular modeling studies, HPV-E6 protein multiple sequence alignment analysis, and some other docking-based characterization reports are briefly described to highlight the importance of computational approaches in protein function prediction.
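The energy-sorted list mentioned above can be illustrated with a minimal sketch. It assumes docking output has already been reduced to (ligand, energy) pairs, one per docked pose; the function name `rank_ligands`, the ligand identifiers and the energy values are hypothetical and do not come from any particular docking program.

```python
def rank_ligands(poses):
    """Rank ligands by their best (lowest-energy) docked pose.

    poses: iterable of (ligand_id, energy) pairs, one per docked pose.
    Returns a list of (ligand_id, best_energy), lowest energy first.
    """
    best = {}
    for ligand_id, energy in poses:
        # Keep only the lowest-energy configuration seen for each ligand.
        if ligand_id not in best or energy < best[ligand_id]:
            best[ligand_id] = energy
    # Sort ligands by that best energy, most favorable first.
    return sorted(best.items(), key=lambda item: item[1])


# Hypothetical pose energies (kcal/mol) for three library ligands.
poses = [
    ("lig_A", -7.2), ("lig_A", -8.1),
    ("lig_B", -6.5), ("lig_B", -9.3),
    ("lig_C", -5.0),
]
ranking = rank_ligands(poses)
for ligand_id, energy in ranking:
    print(f"{ligand_id}\t{energy:.1f}")
```

In a real screening campaign the pairs would be parsed from the docking program's output files, and the top of the ranking would only suggest candidates for experimental testing.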
