A new computational method to split large biochemical networks into coherent subnets

Background

Compared to more general networks, biochemical networks have some special features: while generally sparse, there are a small number of highly connected metabolite nodes; and metabolite nodes can also be divided into two classes: internal nodes with associated mass balance constraints and external ones without. Based on these features, reclassifying selected internal nodes (separators) to external ones can be used to divide a large complex metabolic network into simpler subnetworks. Selection of separators based on node connectivity is commonly used but affords little detailed control and tends to produce excessive fragmentation.

The method proposed here (Netsplitter) allows the user to control separator selection. It combines local connection degree partitioning with global connectivity derived from random walks on the network, to produce a more even distribution of subnetwork sizes. Partitioning is performed progressively, and the interactive visual matrix presentation allows the user considerable control over the process, while incorporating special strategies to maintain network integrity and minimise the information loss due to partitioning.

Results

Partitioning of a genome-scale network of 1348 metabolites and 1468 reactions for Arabidopsis thaliana encapsulates 66% of the network into 10 medium-sized subnets. Applied to the flavonoid subnetwork extracted in this way, Netsplitter is shown to separate it naturally into four subnets with recognisable functionality, namely synthesis of lignin precursors, flavonoids, coumarin and benzenoids. A quantitative quality measure called efficacy is constructed and shows that the new method gives improved partitioning for several metabolic networks, including bacterial, plant and mammal species.

Conclusions

For the examples studied, the Netsplitter method is a considerable improvement on the performance of connection degree partitioning, giving a better balance of subnet sizes with the removal of fewer mass balance constraints. In addition, the user can interactively control which metabolite nodes are selected for cutting and when to stop further partitioning once the desired granularity has been reached. Finally, the blocking transformation at the heart of the procedure provides a powerful visual display of network structure that may be useful for its exploration, independent of whether partitioning is required.
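The separator idea from the Background can be illustrated with a small sketch. This is not the Netsplitter algorithm itself (its random-walk and blocking steps are not specified in this abstract); it only shows the simpler connection-degree baseline that the abstract compares against. The function names, the toy network and the degree threshold are all hypothetical, chosen for illustration. Reactions are treated as linked when they share an internal metabolite, and reclassifying a metabolite as external removes the links it provides.

```python
from collections import defaultdict


def split_by_separators(reactions, separators):
    """Group reactions into subnets linked by shared internal metabolites.

    reactions:  dict mapping reaction name -> iterable of metabolite names
    separators: set of metabolites reclassified as external (ignored as links)
    """
    parent = {r: r for r in reactions}  # union-find forest over reactions

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    first_user = {}  # internal metabolite -> first reaction seen using it
    for rxn, mets in reactions.items():
        for m in mets:
            if m in separators:
                continue  # external nodes do not connect reactions
            if m in first_user:
                ra, rb = find(first_user[m]), find(rxn)
                if ra != rb:
                    parent[rb] = ra
            else:
                first_user[m] = rxn

    groups = defaultdict(set)
    for rxn in reactions:
        groups[find(rxn)].add(rxn)
    return sorted(groups.values(), key=sorted)


def high_degree_metabolites(reactions, threshold):
    """Connection-degree baseline: metabolites used by >= threshold reactions."""
    degree = defaultdict(int)
    for mets in reactions.values():
        for m in set(mets):
            degree[m] += 1
    return {m for m, d in degree.items() if d >= threshold}


# Hypothetical toy network: two short pathways coupled only through ATP.
toy = {
    "R1": ["A", "ATP", "B"],
    "R2": ["B", "C"],
    "R3": ["D", "ATP", "E"],
    "R4": ["E", "F"],
}

whole = split_by_separators(toy, set())
hand_picked = split_by_separators(toy, {"ATP"})
by_degree = split_by_separators(toy, high_degree_metabolites(toy, 2))
```

In this toy case, choosing ATP alone as separator yields the two pathways, whereas a degree threshold of 2 also cuts B and E and leaves every reaction isolated, which mirrors the excessive fragmentation the abstract attributes to selection by connectivity alone.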
