Fast Apriori-based Graph Mining Algorithm and application to 3-dimensional Structure Analysis

The Apriori-based Graph Mining (AGM) algorithm efficiently extracts all subgraph patterns that appear frequently in graph-structured data. It handles general graph-structured data with multiple vertex and edge labels and can analyze the topological structure of graphs. In this paper, we propose a new method for analyzing graph-structured data with 3-dimensional coordinates using AGM. In this method, the distance between each pair of vertices in a graph is calculated and added to the edge label, so that AGM can handle 3-dimensional graph-structured data. One problem with this approach is that the number of edge labels increases, which in turn increases the computation time needed to extract subgraph patterns. To alleviate this problem, we also propose a faster version of AGM that adds an extra constraint to reduce the number of candidates generated in the search for frequent subgraphs. Dopamine antagonist compounds in the MDDR database were analyzed by AGM to characterize their 3-dimensional chemical structures and the correlation of these structures with physiological activity.
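
The distance-labeling step described above could be sketched as follows. This is only an illustration under stated assumptions, not the procedure used in the paper: the abstract does not say how distances are encoded, so the fixed-width binning, the `bin_width` parameter, the `"none"` label for non-bonded pairs, and the function names are all hypothetical.

```python
import math
from itertools import combinations


def distance_label(p, q, bin_width=1.0):
    """Map the Euclidean distance between two 3-D points to a discrete bin.

    Assumption: distances are discretized into fixed-width bins so they
    can serve as categorical edge labels.
    """
    return int(math.dist(p, q) // bin_width)


def label_edges_with_distance(coords, bonds, bin_width=1.0):
    """Attach a distance bin to the edge label of every vertex pair.

    coords: {vertex_id: (x, y, z)}
    bonds:  {(u, v): bond_label} for pairs joined by an edge
    Returns {(u, v): (bond_label, distance_bin)} with u < v.
    Assumption: pairs without an edge get the placeholder label "none".
    """
    labeled = {}
    for u, v in combinations(sorted(coords), 2):
        bond = bonds.get((u, v), bonds.get((v, u), "none"))
        labeled[(u, v)] = (bond, distance_label(coords[u], coords[v], bin_width))
    return labeled


# Hypothetical three-vertex example.
coords = {0: (0.0, 0.0, 0.0), 1: (1.5, 0.0, 0.0), 2: (0.0, 3.2, 0.0)}
bonds = {(0, 1): "single"}
labeled = label_edges_with_distance(coords, bonds)
```

In this sketch a smaller `bin_width` produces more distinct edge labels, which illustrates the growth in label count, and hence in candidate patterns, that the abstract identifies as the cost of the approach.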
