Algorithms for Finding Motifs in Large Labeled Networks

This chapter introduces the different kinds of subgraph analysis problems and discusses some of the important parallel algorithmic techniques that have been developed for them. It focuses primarily on the problem of counting the number of occurrences of a given subgraph, and it considers some special classes of subgraphs, including trees, triangles, and cliques.
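To make the counting problem concrete, the following is a minimal sequential sketch for the simplest special case named above, triangles. It is not taken from the chapter and is not one of its parallel techniques; the function name `count_triangles` and the edge-list input format are assumptions made for illustration. The sketch orders vertices by degree and counts each triangle once, at its lowest-ranked vertex.

```python
from itertools import combinations


def count_triangles(edges):
    """Count triangles in an undirected simple graph given as (u, v) pairs.

    Hypothetical helper for illustration only; self-loops and duplicate
    edges are ignored.
    """
    # Build an adjacency-set representation of the graph.
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # skip self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Rank vertices by degree (ties broken by insertion order, since
    # sorted() is stable). Each triangle is then counted exactly once,
    # at the vertex with the lowest rank among its three corners.
    ordered = sorted(adj, key=lambda x: len(adj[x]))
    rank = {v: i for i, v in enumerate(ordered)}

    count = 0
    for u in adj:
        higher = [w for w in adj[u] if rank[w] > rank[u]]
        # Any two higher-ranked neighbours that are adjacent close a triangle.
        for a, b in combinations(higher, 2):
            if b in adj[a]:
                count += 1
    return count


if __name__ == "__main__":
    # Complete graph on 4 vertices: every 3-subset of vertices is a triangle.
    k4 = [(i, j) for i in range(4) for j in range(i + 1, 4)]
    print(count_triangles(k4))
```

Counting only from the lowest-ranked corner avoids reporting the same triangle three times, and ranking by degree keeps the per-vertex candidate lists short on graphs with skewed degree distributions.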
