How can functional annotations be derived from profiles of phenotypic annotations?

Background: Loss-of-function phenotypes are widely used to infer gene function using the principle that similar phenotypes are indicative of similar functions. However, converting phenotypic to functional annotations requires careful interpretation of phenotypic descriptions and assessment of phenotypic similarity. Understanding how functions and phenotypes are linked will be crucial for the development of methods for the automatic conversion of gene loss-of-function phenotypes to gene functional annotations.

Results: We explored the relation between cellular phenotypes from RNAi-based screens in human cells and gene annotations of cellular functions as provided by the Gene Ontology (GO). Comparing different similarity measures, we found that information content-based measures of phenotypic similarity were the best at capturing gene functional similarity. However, phenotypic similarities did not map to the Gene Ontology organization of gene function but to functions defined as groups of GO terms with shared gene annotations.

Conclusions: Our observations have implications for the use and interpretation of phenotypic similarities as a proxy for gene functions both in RNAi screen data analysis and curation and in the prediction of disease genes.
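To make the idea of an information content-based phenotypic similarity concrete, the following is a minimal sketch and not the measure used in the study, which the abstract does not specify. It assumes that each gene is annotated with a flat set of phenotype terms, that the information content (IC) of a term is the negative log2 of the fraction of genes annotated with it, and that similarity is an IC-weighted Jaccard index. The gene names, phenotype terms, and function names are hypothetical.

```python
import math
from typing import Dict, Set


def information_content(annotations: Dict[str, Set[str]]) -> Dict[str, float]:
    """IC of each phenotype term: -log2(fraction of genes annotated with it)."""
    n_genes = len(annotations)
    counts: Dict[str, int] = {}
    for terms in annotations.values():
        for term in terms:
            counts[term] = counts.get(term, 0) + 1
    return {term: -math.log2(c / n_genes) for term, c in counts.items()}


def ic_weighted_jaccard(a: Set[str], b: Set[str], ic: Dict[str, float]) -> float:
    """Summed IC of shared terms divided by summed IC of all terms of both genes."""
    union_weight = sum(ic[t] for t in a | b)
    if union_weight == 0.0:
        # Both profiles contain only uninformative (or no) terms.
        return 0.0
    return sum(ic[t] for t in a & b) / union_weight


# Hypothetical toy annotations (assumption, not data from the study).
phenotypes = {
    "GENE_A": {"mitotic delay", "cell death"},
    "GENE_B": {"mitotic delay", "cell death"},
    "GENE_C": {"cell death"},
    "GENE_D": {"cell death", "binucleated cells"},
}

ic = information_content(phenotypes)
sim_ab = ic_weighted_jaccard(phenotypes["GENE_A"], phenotypes["GENE_B"], ic)
sim_ac = ic_weighted_jaccard(phenotypes["GENE_A"], phenotypes["GENE_C"], ic)
```

In this toy data, "cell death" is annotated to every gene and therefore carries no information, so GENE_A and GENE_C, which share only that term, are not considered similar, whereas an unweighted Jaccard index would score them at 0.5. Down-weighting frequent, unspecific phenotypes in this way is the general motivation behind IC-based measures.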
