NetMODE: Network Motif Detection without Nauty

A motif in a network is a connected graph that occurs significantly more frequently as an induced subgraph than would be expected in a similar randomized network. Because motifs are atypical, it is thought that they might play a more important role than arbitrary subgraphs. A recent flurry of advances in the study of network motifs has created demand for faster computational means of identifying motifs in increasingly large networks. Motif detection is typically performed by enumerating subgraphs in an input network and in an ensemble of comparison networks, which poses a significant computational problem. For instance, the subgraphs encountered are typically classified using a graph canonical labeling package, such as Nauty, which will typically be called billions of times. In this article, we describe an implementation of a network motif detection package, which we call NetMODE. NetMODE can only perform motif detection for [Formula: see text]-node subgraphs when [Formula: see text], but does so without the use of Nauty. To avoid using Nauty, NetMODE has an initial pretreatment phase, in which [Formula: see text]-node graph data is stored in memory ([Formula: see text]). For [Formula: see text] we take a novel approach, which relates to the Reconstruction Conjecture for directed graphs. We find that NetMODE can run up to around [Formula: see text] times faster than its predecessors when [Formula: see text] and up to around [Formula: see text] times faster when [Formula: see text] (the exact improvement varies considerably). NetMODE also (a) includes a method for generating comparison graphs uniformly at random, (b) can interface with external packages (e.g. R), and (c) can utilize multi-core architectures. NetMODE is available from netmode.sf.net.
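
The idea of replacing repeated canonical-labeling calls with data stored in memory during a pretreatment phase can be illustrated with a minimal Python sketch. This is not NetMODE's actual implementation: the subgraph size `K = 3`, the bitmask encoding, and the names `PAIRS`, `relabel`, `build_table`, `TABLE`, and `classify` are all assumptions made for this illustration. The sketch builds, once, a table mapping every labeled 3-node directed graph to a canonical representative, so that classifying a subgraph met during enumeration reduces to a single array lookup.

```python
from itertools import permutations

K = 3  # subgraph size used for this illustration (assumed, not from NetMODE)

# Ordered (source, target) pairs with source != target.
# Bit i of a mask records whether the edge PAIRS[i] is present.
PAIRS = [(u, v) for u in range(K) for v in range(K) if u != v]


def relabel(mask, perm):
    """Return the bitmask of the graph after renaming node u to perm[u]."""
    out = 0
    for i, (u, v) in enumerate(PAIRS):
        if (mask >> i) & 1:
            out |= 1 << PAIRS.index((perm[u], perm[v]))
    return out


def build_table():
    """Pretreatment: canonical form = smallest mask over all relabelings."""
    perms = list(permutations(range(K)))
    return [min(relabel(m, p) for p in perms) for m in range(1 << len(PAIRS))]


TABLE = build_table()  # computed once, then only read


def classify(nodes, edges):
    """Classify the subgraph induced on `nodes` with one table lookup.

    `nodes` is a sequence of K node names; `edges` is a set of
    (source, target) tuples from the host network.
    """
    mask = 0
    for i, (u, v) in enumerate(PAIRS):
        if (nodes[u], nodes[v]) in edges:
            mask |= 1 << i
    return TABLE[mask]
```

Two isomorphic subgraphs, such as the directed paths a→b→c and z→x→y, receive the same table entry regardless of how their nodes are ordered. A brute-force table like this grows very quickly with the subgraph size, which is consistent with the abstract's remark that a different approach is taken for the largest supported size.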
