Fuzzy Data Mining: Discovery of Fuzzy Generalized Association Rules

Data mining is a key step in knowledge discovery in databases. Classically, mining generalized association rules means discovering relationships between data attributes at all levels of taxonomic structures that are presumed to be exact. In many real-world applications, however, the taxonomic structures may be fuzzy rather than crisp. This paper addresses the mining of generalized association rules with fuzzy taxonomic structures. First, the notions of the degree of support, the degree of confidence, and the R-interest measure are extended to the fuzzy case. The computation of these degrees takes into account that one itemset may belong only partially to another in the taxonomy concerned. Then, Srikant and Agrawal's classical algorithm (including the Apriori algorithm and the Fast algorithm) is extended to discover relationships between data attributes at all levels of fuzzy taxonomic structures. In this way, both crisp and fuzzy association rules can be discovered. Finally, the extended algorithm is run on synthetic data with up to 10^6 transactions. The results show that, with respect to the number of transactions, the extended algorithm has the same level of computational complexity as the classical algorithm.
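To make the idea of partial belonging concrete, the following is a minimal sketch of how fuzzy degrees of support and confidence might be computed over a fuzzy taxonomy. It is illustrative only: the abstract does not state the formulas, so the max-min composition used here (the degree to which an item belongs to an ancestor is the maximum over paths of the minimum edge degree, and a transaction supports an itemset to the minimum degree over its items) is an assumption. The taxonomy, the transactions, and the names `degree`, `fuzzy_support`, and `fuzzy_confidence` are all hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical fuzzy taxonomy: child -> {parent: membership degree}.
# "tomato" belongs to "fruit" with degree 0.7 and to "vegetable" with 0.3.
Taxonomy = Dict[str, Dict[str, float]]

TAXONOMY: Taxonomy = {
    "apple": {"fruit": 1.0},
    "tomato": {"fruit": 0.7, "vegetable": 0.3},
    "cabbage": {"vegetable": 1.0},
}


def degree(item: str, ancestor: str, taxonomy: Taxonomy) -> float:
    """Degree to which `item` belongs to `ancestor`.

    Assumption: max over all upward paths of the min edge degree on the path.
    An item belongs to itself with degree 1; unrelated items get degree 0.
    """
    if item == ancestor:
        return 1.0
    best = 0.0
    for parent, mu in taxonomy.get(item, {}).items():
        best = max(best, min(mu, degree(parent, ancestor, taxonomy)))
    return best


def fuzzy_support(itemset: Set[str], transactions: List[Set[str]],
                  taxonomy: Taxonomy) -> float:
    """Average degree to which the transactions support a non-empty itemset."""
    if not transactions:
        return 0.0
    total = 0.0
    for t in transactions:
        # For each required item, take the best match among the transaction's
        # items, then combine the required items with min.
        total += min(
            max((degree(a, x, taxonomy) for a in t), default=0.0)
            for x in itemset
        )
    return total / len(transactions)


def fuzzy_confidence(antecedent: Set[str], consequent: Set[str],
                     transactions: List[Set[str]], taxonomy: Taxonomy) -> float:
    """Support of (antecedent union consequent) divided by support of antecedent."""
    base = fuzzy_support(antecedent, transactions, taxonomy)
    if base == 0.0:
        return 0.0
    return fuzzy_support(antecedent | consequent, transactions, taxonomy) / base


# Hypothetical transaction data.
TRANSACTIONS: List[Set[str]] = [
    {"apple", "cabbage"},
    {"tomato"},
    {"cabbage"},
    {"tomato", "apple"},
]

if __name__ == "__main__":
    print(fuzzy_support({"fruit"}, TRANSACTIONS, TAXONOMY))
    print(fuzzy_confidence({"fruit"}, {"vegetable"}, TRANSACTIONS, TAXONOMY))
```

With a crisp taxonomy, where every edge has degree 1, each transaction contributes either 0 or 1, and the sketch reduces to the usual support and confidence counts. This matches the abstract's statement that both crisp and fuzzy rules can be discovered in the same framework.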