Structural Graph Indexing for Mining Complex Networks

Systems such as proteins, chemical compounds, and the Internet are increasingly modeled as complex networks in order to identify their local and global characteristics. In many instances, these graphs are very large, which makes them challenging to analyze. Graph indexing techniques have therefore been developed to enhance various graph mining algorithms. In this paper, we propose a new Structural Graph Indexing (SGI) technique that does not limit the number of nodes in an indexed structure, providing an alternative tool for graph mining algorithms. As indexing features, we use common graph structures, namely star, complete bipartite, triangle, and clique, that frequently appear in protein, chemical compound, and Internet graphs. SGI lists all substructures that match these structure formulations, and other graph structures can be identified and added to SGI.
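To make the idea of a structure-based index concrete, the following is a minimal sketch, not the SGI implementation itself. It lists every triangle and every star in a small undirected graph and stores them under the structure name. The function names (`build_adjacency`, `find_triangles`, `find_stars`, `build_index`), the `min_leaves` parameter, and the working definition of a star (a center node with at least two degree-1 neighbors) are assumptions made for illustration; the paper's own structure formulations may differ.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)}, ignoring self-loops."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def find_triangles(adj):
    """List each triangle once as a sorted tuple (u, v, w) with u < v < w."""
    triangles = []
    for u in sorted(adj):
        # Only look at neighbors larger than u so each triangle is reported once.
        higher = sorted(n for n in adj[u] if n > u)
        for v, w in combinations(higher, 2):
            if w in adj[v]:
                triangles.append((u, v, w))
    return triangles


def find_stars(adj, min_leaves=2):
    """Map each star center to its leaves.

    Assumption: a leaf is a neighbor of degree 1, and a star needs
    at least `min_leaves` leaves. No upper bound is placed on star size.
    """
    stars = {}
    for center in adj:
        leaves = sorted(n for n in adj[center] if len(adj[n]) == 1)
        if len(leaves) >= min_leaves:
            stars[center] = leaves
    return stars


def build_index(edges):
    """Collect all matching substructures, keyed by structure name."""
    adj = build_adjacency(edges)
    return {"triangle": find_triangles(adj), "star": find_stars(adj)}


if __name__ == "__main__":
    # A triangle on nodes 1-2-3, attached to a star centered at node 4.
    example_edges = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5), (4, 6), (4, 7)]
    print(build_index(example_edges))
```

Further structures, such as complete bipartite subgraphs or larger cliques, could be added as extra keys in the same dictionary, which mirrors the abstract's point that the index is extensible.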
