Learning hierarchical rule sets

We present an algorithm for learning sets of rules that are organized into up to <italic>k</italic> levels. Each level can contain an arbitrary number of rules of the form “if <italic>c</italic> then <italic>l</italic>”, where <italic>l</italic> is the class associated with the level and <italic>c</italic> is a concept from a given class of basic concepts. The rules of higher levels take precedence over the rules of lower levels and can be used to represent exceptions. As basic concepts we can use Boolean attributes in the infinite attribute space model, or certain concepts defined in terms of substrings. Given a sample of <italic>m</italic> examples, the algorithm runs in polynomial time and produces a consistent representation of size <italic>O((log m)<supscrpt>k</supscrpt>n<supscrpt>k</supscrpt>)</italic>, where <italic>n</italic> is the size of the smallest consistent representation with <italic>k</italic> levels of rules. This implies that the algorithm learns in the PAC model. The algorithm repeatedly applies the greedy heuristic for weighted set cover. The weights are obtained from approximate solutions to previous set cover problems.
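
To make the two ingredients named above concrete, the following is a minimal sketch and not the algorithm of the paper. It shows one plausible reading of how a leveled rule set classifies an example (the highest level with a matching rule wins) and the standard greedy heuristic for weighted set cover. The function names, the representation of a level as a (label, concepts) pair, and the default label returned when no rule matches are all assumptions made for illustration. The way the paper derives weights from earlier set cover solutions is not reproduced here.

```python
def classify(levels, example, default=None):
    """Classify with a leveled rule set (hypothetical representation).

    `levels` is a list of (label, concepts) pairs ordered from highest to
    lowest precedence; each concept is a predicate on the example. The
    label of the first level containing a matching concept is returned,
    so higher levels act as exceptions to lower ones.
    """
    for label, concepts in levels:
        if any(concept(example) for concept in concepts):
            return label
    return default  # assumption: a caller-supplied fallback label


def greedy_weighted_set_cover(universe, subsets, weights):
    """Standard greedy heuristic for weighted set cover.

    Repeatedly picks the subset with the smallest weight per newly
    covered element. Returns the indices of the chosen subsets.
    """
    subsets = [set(s) for s in subsets]
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best, best_ratio = None, float("inf")
        for i, s in enumerate(subsets):
            gain = len(uncovered & s)
            if gain == 0:
                continue  # this subset covers nothing new
            ratio = weights[i] / gain
            if ratio < best_ratio:
                best, best_ratio = i, ratio
        if best is None:
            raise ValueError("universe cannot be covered by the given subsets")
        chosen.append(best)
        uncovered -= subsets[best]
    return chosen
```

In this sketch an exception such as "penguins do not fly" would sit in a higher level than the general rule "birds fly", so `classify` reaches it first. In a learner of the kind described, the elements to be covered would be training examples and the candidate subsets would be the examples matched by each basic concept; that mapping is likewise an illustrative assumption.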