An Optimization Strategy for CFDMiner: An Algorithm for Discovering Constant Conditional Functional Dependencies

Compared with the traditional functional dependency (FD), its extension, the conditional functional dependency (CFD), has shown greater potential for detecting and repairing inconsistent data. CFDMiner is a widely used algorithm for mining constant CFDs. However, its search space is large, which leaves room to improve its efficiency. This paper proposes a pruning strategy that optimizes the algorithm by reducing its search space. Both theoretical analysis and experiments show that the optimized algorithm produces the same results as the original CFDMiner.

Key words: data quality, conditional functional dependency, free itemset, closed itemset, frequent itemset