Knowledge Graph Constraints for Multi-label Graph Classification

Graph classification methods have gained increasing attention in different domains, such as classifying the functions of molecules or detecting bugs in software programs. Similarly, predicting events in manufacturing operations data can be compactly modeled as a graph classification problem. Feature representations of graphs are usually found by mining discriminative sub-graph patterns that are non-uniformly distributed across class labels. However, because these feature selection approaches are computationally expensive for multiple labels, prior knowledge about label correlations should be exploited as much as possible. In this work, we introduce a new approach for mining discriminative sub-graph patterns under constraints that are extracted from links between labels in knowledge graphs, where such links indicate label correlations. Incorporating these constraints allows the search space to be pruned and ensures that the extracted patterns are consistent. As a result, constraint checking remains efficient and more robust classification results can be obtained. We evaluate our approach on one public data set and one custom simulated data set. The evaluation confirms that incorporating constraints still results in efficient pattern mining and can increase the performance of state-of-the-art approaches.
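To make the idea of label-link constraints concrete, the following is a minimal sketch and not the method of this work. It assumes that candidate sub-graph patterns have already been mined and are given as the set of graph ids in which each occurs, that a pattern's discriminative score for a label is the difference between its frequency in positive and in negative graphs, and that a pattern is "consistent" for a label when its score for every linked label is positive. The names `discriminative_score`, `consistent_patterns`, the threshold, and all toy data are hypothetical. The sketch filters candidates after mining; a real miner would apply the check during pattern growth to prune the search space.

```python
from typing import Dict, List, Set, Tuple


def discriminative_score(occurs_in: Set[int], positives: Set[int], n_graphs: int) -> float:
    """Pattern frequency in positive graphs minus frequency in negative graphs.

    Assumed scoring function for illustration only.
    """
    n_negatives = n_graphs - len(positives)
    pos_freq = len(occurs_in & positives) / len(positives) if positives else 0.0
    neg_freq = len(occurs_in - positives) / n_negatives if n_negatives else 0.0
    return pos_freq - neg_freq


def consistent_patterns(
    pattern_occurrences: Dict[str, Set[int]],
    label_positives: Dict[str, Set[int]],
    label_links: List[Tuple[str, str]],
    n_graphs: int,
    threshold: float = 0.3,
) -> Dict[str, List[str]]:
    """Keep, per label, the patterns that pass the score threshold and whose
    score is also positive for every label linked to it (assumed constraint)."""
    selected: Dict[str, List[str]] = {label: [] for label in label_positives}
    for pattern, occurs_in in pattern_occurrences.items():
        scores = {
            label: discriminative_score(occurs_in, positives, n_graphs)
            for label, positives in label_positives.items()
        }
        for label, score in scores.items():
            if score < threshold:
                continue  # not discriminative enough for this label
            # Links are treated as undirected correlations between labels.
            linked = [b for a, b in label_links if a == label]
            linked += [a for a, b in label_links if b == label]
            if all(scores[other] > 0 for other in linked):
                selected[label].append(pattern)
    return selected


# Toy data: six graphs (ids 0..5), three labels, one link between A and B.
label_positives = {"A": {0, 1, 2}, "B": {0, 1, 3}, "C": {4, 5}}
label_links = [("A", "B")]  # hypothetical link taken from a knowledge graph
pattern_occurrences = {"p1": {0, 1, 2}, "p2": {2, 4, 5}, "p3": {2}}

selected = consistent_patterns(pattern_occurrences, label_positives, label_links, n_graphs=6)
print(selected)
```

In this toy setting, pattern `p3` passes the threshold for label A but is rejected because its score for the linked label B is negative, which illustrates how a label link can remove an inconsistent pattern.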
