Dryade: a new approach for discovering closed frequent trees in heterogeneous tree databases

In this paper we present a novel algorithm for discovering tree patterns in a tree database. This algorithm uses a relaxed tree inclusion definition, making the problem more complex (checking tree inclusion is NP-complete), but allowing to mine highly heterogeneous databases. To obtain good performances, our DRYADE algorithm, discovers only closed frequent tree patterns.

[1]  Alexandre Termier,et al.  TreeFinder: a first step towards XML data mining , 2002, 2002 IEEE International Conference on Data Mining, 2002. Proceedings..

[2]  Pekka Kilpeläinen,et al.  Tree Matching Problems with Applications to Structured Text Databases , 2022 .

[3]  Michèle Sebag,et al.  Relational Learning as Search in a Critical Region , 2003, J. Mach. Learn. Res..

[4]  Takashi Washio,et al.  An Apriori-Based Algorithm for Mining Frequent Substructures from Graph Data , 2000, PKDD.

[5]  Mohammed J. Zaki,et al.  CHARM: An Efficient Algorithm for Closed Itemset Mining , 2002, SDM.

[6]  Mohammed J. Zaki Efficiently mining frequent trees in a forest , 2002, KDD.

[7]  Nicolas Pasquier,et al.  Discovering Frequent Closed Itemsets for Association Rules , 1999, ICDT.

[8]  Yun Chi,et al.  HybridTreeMiner: an efficient algorithm for mining frequent rooted trees and free trees using canonical forms , 2004, Proceedings. 16th International Conference on Scientific and Statistical Database Management, 2004..

[9]  Yun Chi,et al.  CMTreeMiner: Mining Both Closed and Maximal Frequent Subtrees , 2004, PAKDD.

[10]  Christian Borgelt,et al.  EFFICIENT IMPLEMENTATIONS OF APRIORI AND ECLAT , 2003 .

[11]  Luc Dehaspe Frequent Pattern Discovery in First-Order Logic , 1999, AI Commun..

[12]  Hiroki Arimura,et al.  Efficient Substructure Discovery from Large Semi-Structured Data , 2001, IEICE Trans. Inf. Syst..

[13]  Hiroki Arimura,et al.  Discovering Frequent Substructures in Large Unordered Trees , 2003, Discovery Science.