Mining Long Patterns of Least-Support Items in Stream

The task of mining long sequential patterns has been studied for years. Typical algorithms apply various cascading support-counting methods, including the basic Apriori algorithm, FP-growth, and algorithms derived from them. It is well known that, during mining, items with very high support can lead to poor time performance and a very large, useless branch search space, especially when those items do not in fact belong to the final long pattern. Conversely, items with the least support relative to the user-specified threshold, which nevertheless belong to a long pattern, may easily be discarded. This problem is more challenging when the data source is a stream, because a data stream is unbounded, time-varying, and cannot be revisited. We carefully consider the role of hidden Markov chain structure and examine how item frequencies evolve in the stream mining context. In this paper we present a method for mining long patterns in data stream application scenarios. Our algorithm can overcome the negative effects described above that arise in stream scenarios.
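
To make the problem statement concrete, the following minimal sketch shows how a naive one-pass counter can lose a low-support item that later turns out to belong to a long pattern. It is an illustration only and is not the method proposed in this paper: the function names, the batch-wise pruning rule, and the toy data are all assumptions introduced here.

```python
from collections import Counter


def naive_stream_item_counts(batches, min_support):
    """Hypothetical one-pass counter: after each batch, permanently drop
    every item whose support so far is below min_support."""
    counts = Counter()
    dropped = set()
    seen = 0  # number of transactions read so far
    for batch in batches:
        for transaction in batch:
            seen += 1
            for item in transaction:
                if item not in dropped:
                    counts[item] += 1
        # The stream cannot be revisited, so a dropped item's history is lost.
        for item in list(counts):
            if counts[item] / seen < min_support:
                dropped.add(item)
                del counts[item]
    return counts, dropped, seen


def exact_support(batches, pattern):
    """Support of an itemset over the whole stream (needs all data stored)."""
    transactions = [t for batch in batches for t in batch]
    pattern = set(pattern)
    return sum(pattern <= t for t in transactions) / len(transactions)


# Toy stream (assumed data): "h" is frequent early but not in the long
# pattern; "r" is rare early but belongs to the long pattern {a, b, c, r}.
batches = [
    [{"h", "a"}, {"h", "b"}, {"h", "c"}, {"a", "b", "c", "r"}],
    [{"a", "b", "c", "r"} for _ in range(4)],
]

counts, dropped, seen = naive_stream_item_counts(batches, min_support=0.5)
long_pattern_support = exact_support(batches, {"a", "b", "c", "r"})
```

In this toy stream, "r" is dropped after the first batch because its support at that point is 1/4, although the pattern {a, b, c, r} reaches a support of 5/8 over the whole stream. The early high-support item "h" is counted throughout the first batch even though it is not part of that pattern.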