An algorithm for mining generalized sequential patterns

Sequential pattern mining is an important data mining problem with broad applications. The GSP algorithm discovers generalized sequential patterns, but it still encounters problems when the sequence database is large and/or the sequential patterns to be mined are long. The PrefixSpan algorithm mines the complete set of sequential patterns faster than GSP, but it cannot mine generalized sequential patterns with time constraints, time windows, and/or taxonomies. This paper proposes EPSpan, a new enhanced method based on PrefixSpan that absorbs the spirit of PrefixSpan and extends it towards mining generalized sequential patterns.