Assessing the Exceptionality of Coloured Motifs in Networks

Various methods have recently been employed to characterise the structure of biological networks. In particular, the concept of network motif and the related one of coloured motif have proven useful for modelling the notion of a functional/evolutionary building block. However, algorithms that enumerate all the motifs of a network may produce a very large output, and methods are needed to decide which motifs should be selected for downstream analysis. A widely used method is to assess whether a motif is exceptional, that is, over- or under-represented with respect to a null hypothesis. Over the last thirty years, much effort has gone into deriving p-values for the frequencies of topological motifs, that is, fixed subgraphs. These p-values rely either on (compound) Poisson and Gaussian approximations for the motif count distribution in Erdős-Rényi random graphs or on simulations in other models. We focus on a different definition of graph motifs that corresponds to coloured motifs. A coloured motif is a connected subgraph with fixed vertex colours but unspecified topology. Our work is the first analytical attempt to assess the exceptionality of coloured motifs in networks without any simulation. We first establish analytical formulae for the mean and the variance of the count of a coloured motif in an Erdős-Rényi random graph model. Using simulations under this model, we further show that a Pólya-Aeppli distribution approximates the distribution of the motif count better than Gaussian or Poisson distributions do. The Pólya-Aeppli distribution, and more generally the compound Poisson distributions, are indeed well suited to modelling counts of clumping events. Altogether, these results make it possible to derive a p-value for a coloured motif without spending time on simulations.
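The last step described above, turning a mean and a variance into a p-value through a Pólya-Aeppli approximation, can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the mean and variance of the motif count are already available (the abstract does not reproduce the formulae, so they are plain inputs here), and it assumes the two Pólya-Aeppli parameters are fitted by matching those two moments, which is one common choice and may differ from the fitting used in the paper. The function names are hypothetical. The probability mass function is obtained with the standard Panjer recursion for a compound Poisson sum with geometric cluster sizes.

```python
import math


def polya_aeppli_params(mean, variance):
    """Moment-matched parameters (lam, a) of a Polya-Aeppli distribution.

    Assumed model: a Poisson(lam) number of clumps, each of geometric size
    P(K = k) = (1 - a) * a**(k - 1), k >= 1. Then
    mean = lam / (1 - a) and variance = lam * (1 + a) / (1 - a)**2.
    """
    if mean <= 0 or variance < mean:
        raise ValueError("requires mean > 0 and variance >= mean")
    a = (variance - mean) / (variance + mean)
    lam = 2.0 * mean ** 2 / (variance + mean)
    return lam, a


def polya_aeppli_pmf(n_max, lam, a):
    """Return [P(X = 0), ..., P(X = n_max)] using the Panjer recursion."""
    # Clump-size probabilities f[k]; index 0 is unused (clumps have size >= 1).
    f = [0.0] + [(1.0 - a) * a ** (k - 1) for k in range(1, n_max + 1)]
    p = [math.exp(-lam)]
    for n in range(1, n_max + 1):
        s = sum(k * f[k] * p[n - k] for k in range(1, n + 1))
        p.append(lam / n * s)
    return p


def upper_tail_p_value(observed, mean, variance):
    """P(X >= observed) under the moment-matched Polya-Aeppli approximation."""
    if observed <= 0:
        return 1.0
    lam, a = polya_aeppli_params(mean, variance)
    pmf = polya_aeppli_pmf(observed - 1, lam, a)
    return max(0.0, 1.0 - sum(pmf))
```

A call such as `upper_tail_p_value(observed=12, mean=4.0, variance=9.0)` would give the probability of seeing at least 12 occurrences of the motif under the approximation, which is the quantity of interest for over-representation. When the variance equals the mean, the clumping parameter `a` is zero and the sketch reduces to an ordinary Poisson tail. A variance below the mean cannot be matched by this family, so the sketch raises an error in that case.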
