Predicting functional associations from metabolism using bi-partite network algorithms

Background: Metabolic reconstructions contain detailed information about metabolic enzymes and their reactants and products. These networks can be used to infer functional associations between metabolic enzymes. Many methods are based on the number of metabolites shared by two enzymes, or the shortest path between two enzymes. Metabolite sharing can miss associations between non-consecutive enzymes in a serial pathway, and shortest-path algorithms are sensitive to high-degree metabolites such as water and ATP that create connections between enzymes with little functional similarity.

Results: We present new, fast methods to infer functional associations in metabolic networks. A local method, the degree-corrected Poisson score, is based only on the metabolites shared by two enzymes, but uses the known metabolite degree distribution. A global method, based on graph diffusion kernels, predicts associations between enzymes that do not share metabolites. Both methods are robust to high-degree metabolites. They out-perform previous methods in predicting shared Gene Ontology (GO) annotations and in predicting experimentally observed synthetic lethal genetic interactions. Including cellular compartment information improves GO annotation predictions but degrades synthetic lethal interaction prediction. These new methods perform nearly as well as computationally demanding methods based on flux balance analysis.

Conclusions: We present fast, accurate methods to predict functional associations from metabolic networks. Biological significance is demonstrated by identifying enzymes whose strong metabolic correlations are missed by conventional annotations in GO, most often enzymes involved in transport vs. synthesis of the same metabolite or other enzyme pairs that share a metabolite but are separated by conventional pathway boundaries.
More generally, the methods described here may be valuable for analyzing other types of networks with long-tailed degree distributions and high-degree hubs.
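
The contrast between local and global scores can be illustrated on a toy enzyme-metabolite bipartite network. The sketch below is only an illustration under stated assumptions: the incidence matrix is hypothetical, the inverse-degree weighting is a simple stand-in and not the degree-corrected Poisson score itself (whose formula is not given in this abstract), and the kernel is the plain heat kernel exp(-beta L) of the bipartite graph Laplacian, which may differ from the normalization the authors use.

```python
import numpy as np

# Hypothetical toy network. Rows: enzymes E0..E4; columns: metabolites A, B, C, ATP.
# E0 -> E1 -> E2 form a serial pathway through B and C; ATP acts as a hub.
B = np.array([
    [1, 1, 0, 1],  # E0: A, B, ATP
    [0, 1, 1, 1],  # E1: B, C, ATP
    [0, 0, 1, 0],  # E2: C
    [0, 0, 0, 1],  # E3: ATP
    [0, 0, 0, 1],  # E4: ATP
], dtype=float)


def shared_count(B):
    """Number of metabolites shared by each enzyme pair."""
    return B @ B.T


def degree_weighted_shared(B):
    """Shared metabolites weighted by 1/degree (illustrative stand-in,
    not the paper's Poisson score). Assumes every metabolite has degree >= 1."""
    weights = 1.0 / B.sum(axis=0)
    return (B * weights) @ B.T


def diffusion_kernel(B, beta=1.0):
    """Enzyme-enzyme block of exp(-beta * L) for the bipartite graph.
    beta is an assumed diffusion parameter."""
    n_enz, n_met = B.shape
    n = n_enz + n_met
    A = np.zeros((n, n))
    A[:n_enz, n_enz:] = B       # enzyme -> metabolite edges
    A[n_enz:, :n_enz] = B.T     # metabolite -> enzyme edges
    L = np.diag(A.sum(axis=1)) - A
    evals, evecs = np.linalg.eigh(L)  # L is symmetric
    K = (evecs * np.exp(-beta * evals)) @ evecs.T
    return K[:n_enz, :n_enz]


raw = shared_count(B)
weighted = degree_weighted_shared(B)
K = diffusion_kernel(B, beta=1.0)
print("E1-E2 vs E0-E3, raw counts:", raw[1, 2], raw[0, 3])
print("E1-E2 vs E0-E3, degree-weighted:", weighted[1, 2], weighted[0, 3])
print("E0-E2 raw count and kernel value:", raw[0, 2], K[0, 2])
```

In this toy network the raw count cannot distinguish a pair linked by the low-degree metabolite C from a pair linked only by ATP, whereas the degree weighting can. E0 and E2 share no metabolite, so any local score is zero for them, while the kernel still assigns the pair a positive value through the path via E1.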
