Paper Information - The Probabilistic Algorithm for Mining Frequent Sequences

The Probabilistic Algorithm for Mining Frequent Sequences

This paper analyzes the problem of determining how frequently subsequences occur in large-volume sequences (texts, databases, etc.). A new algorithm for mining frequent sequences, ProMFS, is proposed. It is based on estimated probabilistic-statistical characteristics of how the elements of the sequence appear and in what order. The algorithm builds a new, much shorter sequence and makes decisions about the main sequence according to the results of analyzing the shorter one.
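The abstract does not specify which statistics ProMFS estimates or how the shorter sequence is generated, so the following Python sketch is only an illustration of the general idea, not the authors' algorithm. It assumes that element frequencies and first-order transition counts are sufficient statistics, that the shorter sequence is sampled from them, and that patterns are contiguous; all function names and parameters (`estimate_model`, `build_short_sequence`, `estimate_frequent`, `short_len`, `min_support`) are hypothetical.

```python
import random
from collections import Counter, defaultdict


def estimate_model(seq):
    """Count how often each element appears and which element follows it."""
    freq = Counter(seq)
    trans = defaultdict(Counter)
    for a, b in zip(seq, seq[1:]):
        trans[a][b] += 1
    return freq, trans


def build_short_sequence(seq, short_len, seed=0):
    """Sample a shorter sequence that mimics the estimated statistics.

    Assumption: a first-order (Markov-style) model stands in for the
    probabilistic-statistical characteristics mentioned in the abstract.
    """
    rng = random.Random(seed)
    freq, trans = estimate_model(seq)
    symbols = list(freq)
    weights = [freq[s] for s in symbols]
    current = rng.choices(symbols, weights=weights)[0]
    short = [current]
    while len(short) < short_len:
        followers = trans.get(current)
        if followers:
            candidates = list(followers)
            current = rng.choices(
                candidates, weights=[followers[s] for s in candidates]
            )[0]
        else:
            # Element seen only at the very end: fall back to overall frequencies.
            current = rng.choices(symbols, weights=weights)[0]
        short.append(current)
    return short


def estimate_frequent(seq, short_len, max_len, min_support, seed=0):
    """Mine contiguous patterns in the short sequence and scale their counts.

    Returns a dict mapping each pattern (a tuple) to its estimated count in
    the main sequence, keeping only patterns that reach min_support.
    """
    short = build_short_sequence(seq, short_len, seed)
    scale = len(seq) / len(short)
    counts = Counter()
    for m in range(1, max_len + 1):
        for i in range(len(short) - m + 1):
            counts[tuple(short[i:i + m])] += 1
    return {p: c * scale for p, c in counts.items() if c * scale >= min_support}


if __name__ == "__main__":
    main_sequence = "ABCABCABDABCABC" * 200
    for pattern, estimate in sorted(
        estimate_frequent(main_sequence, short_len=300, max_len=3, min_support=500).items()
    ):
        print("".join(pattern), round(estimate))
```

The estimates depend on the random seed and on `short_len`: a longer sampled sequence gives estimates closer to the true counts at the price of more work, which is the trade-off the abstract alludes to.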

Gintautas Dzemyda | Romanas Tumasonis
