Computational protein function predictions.

What is the role of this protein? Where does this protein localize in a cell? Are there any ligands that bind to this protein? If so, what are they? Which residues constitute a functional site of the protein? These questions, which in a broader sense seek the biological function of a protein, are fundamental and central in modern biology. Ultimately, the biological function of a protein needs to be determined by experiments. However, a hypothesis is needed to design an assay, because an assay determines whether a target protein has a particular function or not. Biologists come up with hypotheses of protein function from circumstantial evidence, and computational function prediction can play an important part. Computational function prediction methods are also useful for analyzing protein function on a proteomic scale, since they can be applied to a large number of proteins in a realistic amount of time. As more protein function annotations accumulate in various databases, and algorithms advance in the machine learning field, computational protein function prediction methods have become more accurate and reliable in recent years. Moreover, various types of predictions have emerged, which also indicates the maturity of the field. Features of proteins that can be used for function prediction range from conventional sequence information and structures to networks of protein associations. To capture the current landscape of the quite diverse field of computational protein function prediction, this issue collected state-of-the-art function prediction methods of different types. The first three articles [1–3] describe sequence-based methods. The first paper by the Tian group describes their sequence-based function prediction method named GoFDR [1].
GoFDR takes a query protein sequence as input and predicts Gene Ontology (GO) terms for the query from similar sequences retrieved from a sequence database, which is in itself similar to other existing sequence-based methods. A notable feature of GoFDR is that it considers residues that are specific to a particular GO term when transferring that term to the query. Argot2.5 by the Toppo group retrieves sequences similar to a query from databases using two methods, BLAST and HMMER3, a hidden Markov model-based tool, and takes GO terms from the retrieved sequences [2]. An interesting idea implemented in Argot2.5 is that it considers the taxonomy information of sequences; namely, GO terms that seem incompatible with the taxon of a sequence are filtered out. The next article by Das and Orengo …
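The general idea behind such sequence-based methods, transferring GO terms from similar sequences and then discarding terms that conflict with the taxon of the query, can be illustrated with a toy sketch. This is not the GoFDR or Argot2.5 algorithm: the hit list, the similarity scores, the scoring rule (summed similarity normalised by the total), the threshold, and the constraint table are all hypothetical choices made only for illustration.

```python
from collections import defaultdict

# Hypothetical search results: (hit id, similarity score in [0, 1], GO terms of the hit).
# In practice these would come from a tool such as BLAST or HMMER3.
HITS = [
    ("hitA", 0.9, {"GO:0003824", "GO:0015979"}),  # catalytic activity, photosynthesis
    ("hitB", 0.6, {"GO:0003824"}),                # catalytic activity
    ("hitC", 0.3, {"GO:0005215"}),                # transporter activity
]

# Hypothetical constraint table: GO term -> taxa in which the term is treated as implausible.
FORBIDDEN_TAXA = {"GO:0015979": {"Metazoa"}}


def transfer_go_terms(hits):
    """Score each GO term by the summed similarity of the hits that carry it,
    normalised by the total similarity of all hits (an assumed scoring rule)."""
    total = sum(score for _, score, _ in hits)
    if total == 0:
        return {}
    scores = defaultdict(float)
    for _, score, terms in hits:
        for term in terms:
            scores[term] += score
    return {term: value / total for term, value in scores.items()}


def filter_by_taxon(scores, query_taxon, forbidden=FORBIDDEN_TAXA):
    """Drop GO terms listed as implausible for the taxon of the query."""
    return {
        term: value
        for term, value in scores.items()
        if query_taxon not in forbidden.get(term, set())
    }


def predict(hits, query_taxon, threshold=0.3):
    """Return GO terms passing the taxon filter and the score threshold, best first."""
    scores = filter_by_taxon(transfer_go_terms(hits), query_taxon)
    kept = [term for term, value in scores.items() if value >= threshold]
    return sorted(kept, key=lambda term: -scores[term])


if __name__ == "__main__":
    print(predict(HITS, "Metazoa"))
    print(predict(HITS, "Viridiplantae"))
```

With these toy inputs, the photosynthesis term is kept for a plant query but removed for an animal query, while the weakly supported transporter term falls below the threshold in both cases.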

[1]  Christine Brun, et al. Integration of quantitative proteomics data and interaction networks: Identification of dysregulated cellular functions during cancer progression. Methods, 2016.

[2]  Michal Brylinski, et al. Template-based identification of protein-protein interfaces using eFindSitePPI. Methods, 2016.

[3]  Daisuke Kihara, et al. PatchSurfers: Two methods for local molecular property-based binding ligand prediction. Methods, 2016.

[4]  Zhonghao Liu, et al. Mislocalization-related disease gene discovery using gene expression based computational protein localization prediction. Methods, 2016.

[5]  Mateusz Kurcinski, et al. Modeling of protein-peptide interactions using the CABS-dock web server for binding site search and flexible docking. Methods, 2015.

[6]  Weidong Tian, et al. GoFDR: A sequence alignment based method for predicting protein functions. Methods, 2016.

[7]  Shuli Kang, et al. Pushing the annotation of cellular activities to a higher resolution: Predicting functions at the isoform level. Methods, 2016.

[8]  Gaurav Pandey, et al. Predicting protein function and other biomedical characteristics with heterogeneous ensembles. Methods, 2016.

[9]  Sayoni Das, et al. Protein function annotation using protein domain family resources. Methods, 2016.

[10]  Mary Jo Ondrechen, et al. Local structure based method for prediction of the biochemical function of proteins: Applications to glycoside hydrolases. Methods, 2016.

[11]  Renzhi Cao, et al. Integrated protein function prediction by mining function associations, sequences, and protein-protein and gene-gene interaction networks. Methods, 2016.

[12]  Kentaro Tomii, et al. Protein ligand-binding site comparison by a reduced vector representation derived from multidimensional scaling of generalized description of binding sites. Methods, 2016.

[13]  Stefano Toppo, et al. Enhancing protein function prediction with taxonomic constraints - The Argot2.5 web server. Methods, 2016.