Algorithms and complexity of enumerating minimal precursor sets in genome-wide metabolic networks

MOTIVATION: In the context of studying whole metabolic networks and their interaction with the environment, the following question arises: given a set of target metabolites T and a set of possible external source metabolites S, which are the minimal subsets of S that are able to produce all the metabolites in T? Such subsets are called the minimal precursor sets of T. The problem is then whether all of them can be enumerated efficiently.

RESULTS: We propose a new characterization of precursor sets as the inputs of reaction sets called factories, and an efficient algorithm to decide whether a set of sources is a precursor set of T. We prove hardness results for two problems: finding a precursor set of minimum size, and enumerating all minimal precursor sets of T. We also propose two new algorithms which, despite the hardness of the enumeration problem, make it possible to enumerate all minimal precursor sets in networks with up to 1000 reactions.

AVAILABILITY: Source code and datasets used in our benchmarks are freely available for download at http://sites.google.com/site/pitufosoftware/download.

CONTACT: vicente77@gmail.com, pvmilreu@gmail.com or marie-france.sagot@inria.fr.
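To make the problem statement concrete, the sketch below checks whether a set of sources can produce a set of targets and then enumerates the minimal such sets by brute force. It is only an illustration under simplifying assumptions: producibility is modelled as plain forward propagation (a reaction fires once all its substrates are available), which may be more restrictive than the factory-based characterization proposed in the paper, and the enumeration is exponential in the number of sources rather than one of the two algorithms mentioned above. The function names and the toy network are hypothetical.

```python
from itertools import combinations


def producible(reactions, available):
    """Return every metabolite reachable from `available` by forward propagation.

    `reactions` is a list of (substrates, products) pairs of frozensets.
    Assumption: a reaction can fire only when all its substrates are present.
    """
    reached = set(available)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions:
            if substrates <= reached and not products <= reached:
                reached |= products
                changed = True
    return reached


def is_precursor_set(reactions, sources, targets):
    """Decide whether `sources` can produce every metabolite in `targets`."""
    return set(targets) <= producible(reactions, sources)


def minimal_precursor_sets(reactions, sources, targets):
    """Brute-force enumeration of inclusion-minimal precursor sets.

    Candidates are visited by increasing size; because producibility is
    monotone, any candidate containing an already found set is not minimal.
    """
    ordered = sorted(sources)
    found = []
    for size in range(len(ordered) + 1):
        for candidate in combinations(ordered, size):
            candidate = frozenset(candidate)
            if any(smaller <= candidate for smaller in found):
                continue
            if is_precursor_set(reactions, candidate, targets):
                found.append(candidate)
    return found


# Hypothetical toy network: a -> c, b -> c, c + d -> t
TOY_REACTIONS = [
    (frozenset({"a"}), frozenset({"c"})),
    (frozenset({"b"}), frozenset({"c"})),
    (frozenset({"c", "d"}), frozenset({"t"})),
]
TOY_SOURCES = {"a", "b", "d"}
TOY_TARGETS = {"t"}

if __name__ == "__main__":
    for precursor_set in minimal_precursor_sets(TOY_REACTIONS, TOY_SOURCES, TOY_TARGETS):
        print(sorted(precursor_set))
```

In the toy network, the target t needs d together with either a or b, so the source set has two minimal precursor sets. The brute-force loop is practical only for a handful of sources, which is why the hardness results and dedicated algorithms matter for genome-scale networks.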
