Discovery of Direct and Indirect Sequential Patterns with Multiple Minimum Supports in Large Transaction Databases

Sequential pattern mining is an important research topic in data mining and knowledge discovery. Its objective is to find frequent sequences with respect to a single user-specified minimum support threshold, which implicitly assumes that all items in the data occur with similar frequencies. This is often not the case in real-life applications. If item frequencies vary a great deal, we face the dilemma known as the rare item problem: a high threshold misses patterns that involve rare items, while a low threshold generates an overwhelming number of patterns among frequent items. To resolve this dilemma, we propose an algorithm that discovers sequential patterns under a multiple minimum supports model, in which a different minimum item support can be specified for each item. The algorithm discovers not only sequential patterns formed among frequent sequences, but also those formed between frequent and rare sequences or among rare sequences only. In addition, an algorithm for mining both direct and indirect sequential patterns with multiple minimum supports is presented.
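
The effect of per-item minimum supports can be illustrated with a small sketch. It is not the algorithm proposed here; it only shows the frequency test, under two assumptions that the abstract does not state: each sequence element is a single item (rather than an itemset), and the threshold for a sequence is the lowest minimum item support among its items, a common convention in work on multiple minimum supports. The names `mis`, `support`, and `is_frequent` and the toy data are hypothetical.

```python
from typing import Dict, List, Sequence


def is_subsequence(pattern: Sequence[str], sequence: Sequence[str]) -> bool:
    """True if the items of `pattern` occur in `sequence` in order (gaps allowed)."""
    remaining = iter(sequence)
    # `item in remaining` consumes the iterator up to the first match,
    # so later items must be found after earlier ones.
    return all(item in remaining for item in pattern)


def support(pattern: Sequence[str], database: List[List[str]]) -> float:
    """Fraction of data sequences that contain `pattern` as a subsequence."""
    return sum(is_subsequence(pattern, s) for s in database) / len(database)


def is_frequent(pattern: Sequence[str], database: List[List[str]],
                mis: Dict[str, float]) -> bool:
    """Assumed rule: compare support with the lowest MIS among the pattern's items."""
    threshold = min(mis[item] for item in pattern)
    return support(pattern, database) >= threshold


# Toy database: "blender" is a rare item, "bread" and "milk" are common.
database = [
    ["bread", "milk", "bread"],
    ["bread", "milk"],
    ["bread", "blender"],
    ["milk", "bread"],
    ["bread", "blender", "milk"],
]

# Hypothetical per-item minimum supports versus one uniform threshold.
mis = {"bread": 0.5, "milk": 0.5, "blender": 0.3}
uniform = {"bread": 0.5, "milk": 0.5, "blender": 0.5}

rare_pattern = ("bread", "blender")
print(support(rare_pattern, database))               # support of the rare pattern
print(is_frequent(rare_pattern, database, mis))      # kept under per-item supports
print(is_frequent(rare_pattern, database, uniform))  # lost under a single threshold
```

With these values, the pattern containing the rare item is retained under the per-item thresholds but discarded under the uniform one, which is the situation the rare item problem describes.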
