Efficient Mining of Top-k Breaker Emerging Subgraph Patterns from Graph Datasets

This paper introduces a new type of discriminative subgraph pattern, the breaker emerging subgraph pattern, defined by three constraints and two new concepts: base and breaker. A breaker emerging subgraph pattern consists of three subpatterns: a constrained emerging subgraph pattern, a set of bases, and a set of breakers. An efficient approach is proposed for discovering the top-k breaker emerging subgraph patterns in graph datasets. Experimental results show that the approach discovers top-k breaker emerging subgraph patterns efficiently and is more efficient than two previous methods for mining discriminative subgraph patterns. The discovered top-k breaker emerging subgraph patterns are more informative, more discriminative, more accurate, and more compact than minimal distinguishing subgraph patterns. They are also more useful for substructure analysis, such as molecular fragment analysis.
