Network inference through synergistic subnetwork evolution

The study of signaling networks is important for a better understanding of cell behaviors (e.g., growth, differentiation, metabolism, and apoptosis) and for gaining deeper insights into the molecular mechanisms of complex diseases. While there have been many successes in developing computational approaches for identifying potential genes and proteins involved in cell signaling, new methods are needed for identifying network structures that depict the underlying signal cascading mechanisms. In this paper, we propose a new computational approach for inferring signaling network structures from overlapping gene sets related to the networks. In the proposed approach, a signaling network is represented as a directed graph and is viewed as a union of many active paths, each representing a linear chain of signal cascading activity that may overlap with other chains in the network. Each gene set contains the genes participating in one active path, with no prior knowledge of the order in which the genes occur within that path. From a compendium of unordered gene sets, the proposed algorithm reconstructs the underlying network structure through the evolution of synergistic active paths. In our context, the extent of edge overlap among active paths is used to define the synergy present in a network. We evaluated the performance of the proposed algorithm in terms of its convergence and its recovery of true active paths, using four gene set compendiums derived from the KEGG database. The results demonstrate the ability of the algorithm to reconstruct the underlying networks with high accuracy and precision.
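To make the representation concrete, the following minimal Python sketch treats each active path as an ordered list of genes, builds the network as the union of the paths' directed edges, and scores one candidate ordering of the gene sets by edge overlap. The function names, the placeholder gene labels, and the specific overlap score (edge occurrences beyond the first) are illustrative assumptions, not the measure or the evolutionary search used in the paper.

```python
from collections import Counter


def path_edges(path):
    """Directed edges of a linear path given as an ordered list of genes."""
    return list(zip(path, path[1:]))


def union_network(paths):
    """Network viewed as the union of the directed edges of all active paths."""
    edges = set()
    for path in paths:
        edges.update(path_edges(path))
    return edges


def edge_overlap(paths):
    """Assumed synergy score: number of edge occurrences beyond the first.

    This is a hypothetical stand-in for the paper's synergy measure.
    """
    counts = Counter(edge for path in paths for edge in path_edges(path))
    return sum(count - 1 for count in counts.values())


# Two unordered gene sets, {A, B, C} and {B, C, D}, under two candidate orderings.
candidate_1 = [["A", "B", "C"], ["B", "C", "D"]]  # both paths share the edge B -> C
candidate_2 = [["A", "B", "C"], ["C", "B", "D"]]  # no directed edge is shared

for name, candidate in [("candidate_1", candidate_1), ("candidate_2", candidate_2)]:
    print(name, "overlap:", edge_overlap(candidate),
          "edges:", sorted(union_network(candidate)))
```

Under this assumed score, the first ordering is the more synergistic one: its two paths reuse the edge B -> C, so the union network needs three edges rather than four. A search procedure of the kind described above would prefer such orderings when choosing among the possible orderings of each gene set.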
