Paper information - Efficient enumeration of frequent sequences

In this paper we present SPADE, a new algorithm for fast discovery of sequential patterns. Existing solutions to this problem make repeated database scans and use complex hash structures that have poor locality. SPADE uses combinatorial properties to decompose the original problem into smaller sub-problems that can be solved independently in main memory using efficient lattice search techniques and simple join operations. All sequences are discovered in only three database scans. Experiments show that SPADE outperforms the best previous algorithm by a factor of two, and by an order of magnitude with some pre-processed data. It also scales linearly with the number of customers and with a number of other database parameters.
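The "simple join operations" mentioned in the abstract can be illustrated with a small sketch. The abstract does not describe the data layout, so this example assumes a vertical layout in which each item keeps a list of (sequence id, event id) pairs; the function names `build_idlists`, `temporal_join`, and `support`, and the toy database, are hypothetical and chosen only for illustration. It is not the author's implementation, only one plausible way to count the support of "A followed by B" with a single join.

```python
from collections import defaultdict


def build_idlists(database):
    """Map each item to a sorted list of (sid, eid) pairs where it occurs.

    Assumed layout: database is {sid: [(eid, set_of_items), ...]}.
    """
    idlists = defaultdict(list)
    for sid, events in database.items():
        for eid, items in events:
            for item in items:
                idlists[item].append((sid, eid))
    return {item: sorted(pairs) for item, pairs in idlists.items()}


def temporal_join(idlist_a, idlist_b):
    """Return the id-list of the sequence 'a, later b'.

    A pair (sid, eid) of b is kept when a occurs in the same
    sequence at a strictly smaller event id.
    """
    first_a = {}
    for sid, eid in idlist_a:
        if sid not in first_a or eid < first_a[sid]:
            first_a[sid] = eid
    return [
        (sid, eid)
        for sid, eid in idlist_b
        if sid in first_a and first_a[sid] < eid
    ]


def support(idlist):
    """Support is the number of distinct sequences in the id-list."""
    return len({sid for sid, _ in idlist})


# Toy database (made up for this sketch): three customer sequences.
database = {
    1: [(10, {"A", "B"}), (20, {"C"}), (30, {"B"})],
    2: [(10, {"B"}), (20, {"A"})],
    3: [(10, {"A"}), (20, {"B", "C"})],
}

idlists = build_idlists(database)
a_then_b = temporal_join(idlists["A"], idlists["B"])
print(a_then_b, support(a_then_b))
```

In this toy database, "A followed by B" holds in sequences 1 and 3, so its support is 2. Once the per-item lists are built, the join needs no further pass over the database, which is the kind of property that would let sub-problems be solved in main memory.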

Mohammed J. Zaki
