Engineering Motif Search for Large Graphs

In the graph motif problem, the input is a vertex-colored graph H (the host graph) and a multiset of colors M (the motif). The task is to decide whether H has a connected set of vertices whose multiset of colors agrees with M. The graph motif problem is NP-complete, but it is known to admit parameterized algorithms whose running time is linear in the size of H. We demonstrate that algorithms based on constrained multilinear sieving are viable in practice, scaling to graphs with hundreds of millions of edges as long as M remains small. Furthermore, our implementation is topology-invariant relative to the host graph H, meaning that the crudest graph parameters (the number of edges and the number of vertices) suffice in practice to determine the algorithm's performance.
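To make the problem statement concrete, the following is a minimal brute-force sketch of the decision problem as defined above. It is not the constrained multilinear sieving approach the paper engineers; it simply enumerates every vertex subset of size |M|, so it is only usable on tiny inputs. The names `is_connected`, `find_motif_bruteforce`, `adj`, and `color` are hypothetical and chosen for illustration, and the host graph is assumed to be an undirected adjacency dictionary that lists every vertex as a key.

```python
from collections import Counter
from itertools import combinations


def is_connected(vertices, adj):
    """Return True if `vertices` induces a connected subgraph of `adj`."""
    vertices = set(vertices)
    if not vertices:
        return False
    start = next(iter(vertices))
    seen = {start}
    stack = [start]
    # Depth-first search restricted to the chosen vertex set.
    while stack:
        u = stack.pop()
        for v in adj.get(u, ()):
            if v in vertices and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == vertices


def find_motif_bruteforce(adj, color, motif):
    """Return a connected vertex set whose colors match `motif`, or None.

    Illustrative baseline only: tries all subsets of size len(motif),
    which is exponential and unrelated to the sieving algorithm.
    """
    target = Counter(motif)          # the motif M as a multiset of colors
    k = sum(target.values())         # |M|, the size of the sought vertex set
    for subset in combinations(sorted(adj), k):
        colors = Counter(color[v] for v in subset)
        if colors == target and is_connected(subset, adj):
            return set(subset)
    return None


# Toy host graph H: the path 0 - 1 - 2 - 3 (assumed example data).
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
color = {0: "red", 1: "blue", 2: "red", 3: "green"}

match = find_motif_bruteforce(adj, color, ["red", "blue", "red"])
no_match = find_motif_bruteforce(adj, color, ["blue", "green"])
```

On this toy path, the motif {red, blue, red} is matched by the connected set {0, 1, 2}, while {blue, green} has no match because vertices 1 and 3 are not adjacent. The cost of the enumeration grows with the number of size-|M| subsets of H, which is the kind of blow-up that the linear-in-H parameterized algorithms mentioned above avoid.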
