Theoretical Bounds for the Number of Inferable Edges in Sparse Random Networks

Inferring a network structure from experimental data that provide dynamical information about the underlying system is an important and still unsolved problem when the number of nodes in the network is not small. For example, high-throughput data from metabolic, signaling, or transcriptional regulatory networks provide information on thousands of genes or their products. Theoretically, the graph-theoretical measure d-separation provides a criterion for recovering the network structure edge by edge by calculating partial correlations. In practice, however, partial correlations cannot be estimated up to arbitrary order for large networks, because the number of possible d-separating sets grows exponentially with the number of nodes in the network. In this paper, we numerically determine theoretical bounds for the number of inferable edges in directed (possibly cyclic) sparse random networks when the maximal size of d-separating sets is restricted to n_max. Under ideal experimental conditions, these bounds correspond to the maximal precision with which an unknown network structure can be recovered using partial correlations of order up to n_max.
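
The restriction on the size of d-separating sets can be illustrated with a minimal Python sketch. It is not the procedure used in the paper: it assumes a known, well-conditioned correlation matrix (ideal data, linear dependencies), and the function names `partial_corr`, `num_conditioning_sets`, and `has_separating_set`, as well as the tolerance `tol`, are hypothetical choices made for this illustration. The parameter `n_max` in the code stands for the maximal conditioning-set size.

```python
from itertools import combinations
from math import comb

import numpy as np


def partial_corr(corr, i, j, cond):
    """Partial correlation of variables i and j given the variables in cond.

    Computed from the inverse of the correlation submatrix over (i, j, cond).
    """
    idx = [i, j, *cond]
    precision = np.linalg.inv(corr[np.ix_(idx, idx)])
    return -precision[0, 1] / np.sqrt(precision[0, 0] * precision[1, 1])


def num_conditioning_sets(n_nodes, n_max):
    """Number of candidate conditioning sets of size <= n_max for one node pair."""
    return sum(comb(n_nodes - 2, k) for k in range(n_max + 1))


def has_separating_set(corr, i, j, n_max, tol=1e-8):
    """Return True if some conditioning set of size <= n_max makes the
    partial correlation between i and j vanish (up to the assumed tolerance)."""
    others = [k for k in range(corr.shape[0]) if k not in (i, j)]
    for order in range(n_max + 1):
        for cond in combinations(others, order):
            if abs(partial_corr(corr, i, j, cond)) < tol:
                return True
    return False


# Illustrative chain X0 -> X1 -> X2: the correlation between X0 and X2
# is the product of the two edge correlations (0.6 * 0.5 = 0.3).
corr = np.array([
    [1.0, 0.6, 0.3],
    [0.6, 1.0, 0.5],
    [0.3, 0.5, 1.0],
])

print(has_separating_set(corr, 0, 2, n_max=0))  # order 0 only
print(has_separating_set(corr, 0, 2, n_max=1))  # conditioning on X1 allowed
print(num_conditioning_sets(1000, 2))           # sets to test per pair
```

In this toy chain, the pair (X0, X2) is not adjacent, but this only becomes detectable once conditioning sets of size 1 are allowed. Without a bound on the order, the number of candidate sets per node pair is 2^(N-2) for N nodes, which is what makes an unrestricted search infeasible for large networks.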