EvoCor: a platform for predicting functionally related genes using phylogenetic and expression profiles

The wealth of publicly available gene expression and genomic data provides unique opportunities for computational inference of groups of genes that function together to control specific cellular processes. Such genes are likely to have co-evolved and to be expressed in the same tissues and cells. Unfortunately, the expertise and computational resources required to compare tens of genomes and gene expression data sets make this type of analysis difficult for the average end-user. Here, we describe the implementation of a web server that predicts which genes act together with a gene of interest in specific cellular processes. We named the server ‘EvoCor’ to denote that it detects functional relationships among genes through evolutionary analysis and gene expression correlation. The web server integrates profiles of sequence divergence derived by a Hidden Markov Model (HMM) with tissue-wide gene expression patterns to determine putative functional linkages between pairs of genes. The server is easy to use and freely available at http://pilot-hmm.vbi.vt.edu/.
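As a rough illustration of how two kinds of profile could be combined to rank candidate partners for a gene of interest, the sketch below correlates per-genome divergence profiles and per-tissue expression profiles, then sums the two correlations. The abstract does not specify EvoCor's actual scoring, so the Pearson correlation, the additive combination, the function names (`pearson`, `rank_candidates`) and the toy values are all assumptions made for illustration only.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / sqrt(var_x * var_y)


def rank_candidates(query, evolution, expression):
    """Rank genes by summed evolutionary and expression correlation.

    Assumption: the two correlations are simply added; the real server
    may weight or threshold them differently.
    """
    scores = {}
    for gene in evolution:
        if gene == query or gene not in expression:
            continue
        r_evo = pearson(evolution[query], evolution[gene])
        r_expr = pearson(expression[query], expression[gene])
        scores[gene] = r_evo + r_expr
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy data: one divergence score per genome, one
# expression level per tissue.
evolution = {
    "geneA": [0.9, 0.8, 0.2, 0.1],
    "geneB": [0.8, 0.9, 0.1, 0.2],
    "geneC": [0.1, 0.2, 0.9, 0.8],
}
expression = {
    "geneA": [5.0, 1.0, 3.0],
    "geneB": [6.0, 2.0, 4.0],
    "geneC": [1.0, 5.0, 2.0],
}

ranking = rank_candidates("geneA", evolution, expression)
for gene, score in ranking:
    print(gene, round(score, 3))
```

In this toy input, geneB tracks geneA in both profiles and therefore ranks above geneC, whose profiles run opposite to geneA's.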