Mining generalized association rules for sequential and path data

Association rules for set data exploit and describe the relations between parts of set-valued objects completely. Association rules for sequential data, by contrast, are restricted by the specific interpretation of the subsequence relation they adopt: contiguous subsequences describe local features of a sequence-valued object, noncontiguous subsequences its global features. We model both types of features with generalized subsequences, which describe local deviations by wildcards, and present a new Apriori-type algorithm for mining all generalized subsequences with a prescribed minimum support from a given database of sequences. Furthermore, we show that the algorithm automatically takes into account any underlying graph structure that may be present, i.e., it is also applicable to path data.
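
To make the notion of a generalized subsequence concrete, the following minimal Python sketch illustrates one possible reading of it. It is not the algorithm presented in the paper. It assumes that a wildcard (written here with the hypothetical token `"*"`) stands for a gap of one or more arbitrary elements, while symbols written next to each other must occur contiguously. The names `WILD`, `matches_at`, `contains`, `support`, and `mine` are introduced only for this illustration. The search is a simple level-wise prefix extension that relies on the fact that a pattern can only be frequent if its prefix is frequent.

```python
WILD = "*"  # hypothetical wildcard token: a gap of one or more elements (assumption)


def matches_at(pattern, seq, i):
    """Return True if pattern matches seq starting exactly at position i."""
    if not pattern:
        return True
    head, rest = pattern[0], pattern[1:]
    if head == WILD:
        # The gap consumes at least one element; the rest resumes at position j.
        return any(matches_at(rest, seq, j) for j in range(i + 1, len(seq) + 1))
    return i < len(seq) and seq[i] == head and matches_at(rest, seq, i + 1)


def contains(seq, pattern):
    """Return True if pattern occurs in seq as a generalized subsequence."""
    return any(matches_at(pattern, seq, i) for i in range(len(seq)))


def support(pattern, database):
    """Fraction of sequences in the database that contain the pattern."""
    return sum(contains(seq, pattern) for seq in database) / len(database)


def mine(database, min_support):
    """Level-wise search: extend each frequent pattern by one symbol,
    either contiguously or after a wildcard gap."""
    symbols = sorted({x for seq in database for x in seq})
    level = {}
    for s in symbols:
        supp = support((s,), database)
        if supp >= min_support:
            level[(s,)] = supp
    frequent = {}
    while level:
        frequent.update(level)
        next_level = {}
        for p in level:
            for s in symbols:
                for cand in (p + (s,), p + (WILD, s)):
                    supp = support(cand, database)
                    if supp >= min_support:
                        next_level[cand] = supp
        level = next_level
    return frequent


# Toy database of three sequences (illustrative data only).
db = [("a", "b", "c"), ("a", "x", "c"), ("a", "b", "d", "c")]
result = mine(db, 1.0)
```

On this toy database with minimum support 1.0, the contiguous pattern `("a", "c")` occurs in no sequence, whereas the generalized pattern `("a", "*", "c")` occurs in all three, which shows how a wildcard absorbs a local deviation. Under the same assumptions, the sketch could be applied to path data by treating each path as a sequence of vertices.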