A proposal for a model for dealing with value-based data dependencies to improve the rule discovery process

Discovering conjunctive "if-then" classification rules by enumerating all possible conjunctions of terms may be intractable. Various algorithms, notably C4.5 and CART, instead adopt a univariate strategy that selects the single best variable at each step. While computationally feasible, such an approach may leave portions of the database unexplored, and these portions may contain valuable nuggets. An exhaustive evaluation of all possible conjunctions, on the other hand, may be intractable even for relatively small datasets. We propose a general approach that reduces the size of the search space of conjunctive "if-then" rule discovery algorithms by exploiting value-based data dependencies among the independent variables.
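
To make the idea of search-space reduction concrete, the following is a minimal sketch and not the model proposed here. It assumes that a value-based dependency means "every record with A = a has the same value b for B". Under that assumption, any conjunction containing both A = a and a term on B is either redundant (B = b adds nothing to A = a) or covers no records (B takes another value), so it can be skipped. The function names and the toy records in the usage note are hypothetical.

```python
from itertools import combinations, product


def find_value_dependencies(rows, attrs):
    """Return {(A, a): {B: b}} for every case where all rows with A == a share B == b.

    Assumption: this is one simple reading of "value-based dependency",
    detected empirically from the given rows.
    """
    seen = {}  # (A, a, B) -> set of B values observed together with A == a
    for row in rows:
        for a_attr in attrs:
            for b_attr in attrs:
                if a_attr != b_attr:
                    key = (a_attr, row[a_attr], b_attr)
                    seen.setdefault(key, set()).add(row[b_attr])
    deps = {}
    for (a_attr, a_val, b_attr), b_vals in seen.items():
        if len(b_vals) == 1:
            deps.setdefault((a_attr, a_val), {})[b_attr] = next(iter(b_vals))
    return deps


def enumerate_conjunctions(domains, max_len):
    """Yield each conjunction as a tuple of (attribute, value) terms."""
    attrs = sorted(domains)
    for k in range(1, max_len + 1):
        for subset in combinations(attrs, k):
            for values in product(*(sorted(domains[a]) for a in subset)):
                yield tuple(zip(subset, values))


def is_pruned(conj, deps):
    """True when one term already determines another attribute used in conj."""
    terms = dict(conj)
    return any(b_attr in terms for term in conj for b_attr in deps.get(term, {}))


def count_candidates(rows, max_len):
    """Return (all conjunctions, conjunctions left after pruning)."""
    attrs = sorted(rows[0])
    domains = {a: {row[a] for row in rows} for a in attrs}
    deps = find_value_dependencies(rows, attrs)
    total = kept = 0
    for conj in enumerate_conjunctions(domains, max_len):
        total += 1
        if not is_pruned(conj, deps):
            kept += 1
    return total, kept
```

As a usage illustration, take six hypothetical records over the attributes city, state, and plan, in which each of three cities lies in exactly one of two states and appears once with each of two plans. Calling count_candidates(rows, 3) enumerates 35 conjunctions and keeps 17, because every conjunction that combines a city term with a state term is skipped. Dependencies detected this way hold only for the data at hand, so a real system would need to decide how to treat values that are rare or unseen.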