Parallel Network Motif Finding

Network motifs are over-represented patterns within a network and are regarded as the fundamental building blocks of that network. Finding network motifs is closely related to the classical subgraph isomorphism problem in computer science, which finds instances of a particular subgraph in a graph. This problem is NP-complete, and thus even for relatively small subgraphs and graphs, today’s most efficient approaches require a large amount of computation time. Here we present parallel algorithms for network motif finding, including both query parallelization, where different subgraphs are searched for in parallel, and network parallelization, where the original network is partitioned into overlapping regions and a single subgraph is searched for in parallel across those regions. The network partitioning algorithm is a novel application of hierarchical clustering and divides the network into strongly connected components while minimizing the overlap between them. Our results show that query-parallel network motif finding achieves a speedup that scales linearly with the number of workers, and allows us to systematically search for larger motifs than was previously feasible. In addition, our network partitioning algorithms partition the network effectively, which enables network parallelization to reduce the time required to count instances of a particular subgraph.
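
As a rough illustration of query parallelization, the Python sketch below counts occurrences of several small query subgraphs in the same directed network, one query per worker. It is not the implementation evaluated in this work: the function names (`count_embeddings`, `count_queries_parallel`), the brute-force counter, and the toy graph are all assumptions introduced for illustration. The counter enumerates injective, edge-preserving node mappings, so each instance is counted once per automorphism of the query. Threads are used only to keep the sketch short; a CPU-bound workload would more plausibly use separate processes or machines.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import permutations


def count_embeddings(graph_edges, n_nodes, pattern_edges, k):
    """Brute-force count of injective maps from pattern nodes 0..k-1
    to graph nodes 0..n_nodes-1 that preserve every directed pattern edge.

    Hypothetical helper for illustration; exponential in k.
    """
    edge_set = set(graph_edges)
    count = 0
    for mapping in permutations(range(n_nodes), k):
        if all((mapping[u], mapping[v]) in edge_set for u, v in pattern_edges):
            count += 1
    return count


def count_queries_parallel(graph_edges, n_nodes, queries, max_workers=4):
    """Query parallelization: each query subgraph is searched independently.

    `queries` maps a query name to (pattern_edges, number_of_pattern_nodes).
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            name: pool.submit(count_embeddings, graph_edges, n_nodes, edges, k)
            for name, (edges, k) in queries.items()
        }
        return {name: future.result() for name, future in futures.items()}


# Toy directed network (assumed data): a 3-cycle 0->1->2->0 plus the edge 0->3.
example_edges = [(0, 1), (1, 2), (2, 0), (0, 3)]
example_queries = {
    "cycle3": ([(0, 1), (1, 2), (2, 0)], 3),  # directed triangle
    "path2": ([(0, 1), (1, 2)], 3),           # directed path of two edges
}
example_counts = count_queries_parallel(example_edges, 4, example_queries)
```

Because the queries share no state, the work divides cleanly across workers, which is the property that makes a near-linear speedup plausible for this style of parallelization.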