Direct Discriminative Pattern Mining for Effective Classification

The application of frequent patterns in classification has demonstrated its power in recent studies. It often adopts a two-step approach: frequent pattern (or classification rule) mining followed by feature selection (or rule ranking). However, this two-step process could be computationally expensive, especially when the problem scale is large or the minimum support is low. It was observed that frequent pattern mining usually produces a huge number of "patterns" that could not only slow down the mining process but also make feature selection hard to complete. In this paper, we propose a direct discriminative pattern mining approach, DDPMine, to tackle the efficiency issue arising from the two-step approach. DDPMine performs a branch-and-bound search for directly mining discriminative patterns without generating the complete pattern set. Instead of selecting best patterns in a batch, we introduce a "feature-centered" mining approach that generates discriminative patterns sequentially on a progressively shrinking FP-tree by incrementally eliminating training instances. The instance elimination effectively reduces the problem size iteratively and expedites the mining process. Empirical results show that DDPMine achieves orders of magnitude speedup without any downgrade of classification accuracy. It outperforms the state-of-the-art associative classification methods in terms of both accuracy and efficiency.
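The sequential, feature-centered loop described in the abstract can be illustrated with a minimal Python sketch. It is not the DDPMine implementation. Three things here are assumptions made for illustration: information gain stands in as the discriminative measure, an exhaustive search over short itemsets replaces the branch-and-bound search on an FP-tree, and an instance is eliminated as soon as a selected pattern covers it. The names `best_pattern`, `mine_sequentially`, `min_sup`, and `max_len` are hypothetical.

```python
from itertools import combinations
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return -sum(c / n * log2(c / n) for c in counts.values())


def info_gain(pattern, rows):
    """Information gain of splitting rows on whether they contain pattern."""
    labels = [y for _, y in rows]
    inside = [y for items, y in rows if pattern <= items]
    outside = [y for items, y in rows if not pattern <= items]
    n = len(rows)
    cond = len(inside) / n * entropy(inside) + len(outside) / n * entropy(outside)
    return entropy(labels) - cond


def best_pattern(rows, min_sup, max_len):
    """Exhaustive stand-in (assumption) for the branch-and-bound search:
    return the itemset with the highest positive gain, or None."""
    items = sorted(set().union(*(its for its, _ in rows)))
    best, best_gain = None, 0.0
    for k in range(1, max_len + 1):
        for combo in combinations(items, k):
            p = frozenset(combo)
            support = sum(1 for its, _ in rows if p <= its)
            if support < min_sup:
                continue
            g = info_gain(p, rows)
            if g > best_gain:
                best, best_gain = p, g
    return best


def mine_sequentially(rows, min_sup=1, max_len=2):
    """Pick one pattern at a time, then drop the instances it covers,
    so each later search runs on a smaller training set."""
    selected = []
    remaining = list(rows)
    while remaining:
        p = best_pattern(remaining, min_sup, max_len)
        if p is None:  # no pattern separates the remaining classes
            break
        selected.append(p)
        remaining = [(its, y) for its, y in remaining if not p <= its]
    return selected


# Toy data: each row is (set of items, class label).
rows = [({"a", "b"}, 1), ({"a", "c"}, 1), ({"b", "c"}, 0), ({"c"}, 0)]
patterns = mine_sequentially(rows)
```

On this toy data the item `a` separates the two classes perfectly, so it is selected first. The two instances it covers are then removed, the remaining instances all share one class, and the loop stops. In this sketch the shrinking list `remaining` plays the role that the abstract assigns to the progressively shrinking FP-tree.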
