ERMiner: Sequential Rule Mining Using Equivalence Classes

Sequential rule mining is an important data mining task with wide applications. The current state-of-the-art algorithm for this task, RuleGrowth, relies on a pattern-growth approach to discover sequential rules. A drawback of this approach is that it repeatedly performs a costly database projection operation, which degrades performance on datasets containing dense or long sequences. In this paper, we address this issue by proposing an algorithm named ERMiner (Equivalence class based sequential Rule Miner) for mining sequential rules. It relies on the novel idea of searching using equivalence classes of rules that share the same antecedent or the same consequent. Furthermore, it includes a data structure named SCM (Sparse Count Matrix) to prune the search space. An extensive experimental study on five real-life datasets shows that ERMiner is up to five times faster than RuleGrowth, at the cost of higher memory consumption.
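The two ideas named above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how equivalence classes or the SCM are built, so the sketch assumes that a rule is a pair (antecedent, consequent) of item sets, that an equivalence class groups rules sharing the same antecedent, and that the SCM stores, for each pair of items, the number of sequences containing both. The toy database and the helper names `build_sparse_count_matrix`, `pair_count`, `may_be_frequent`, and `group_by_antecedent` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Toy sequence database (hypothetical): each sequence is a list of itemsets.
sequences = [
    [{"a", "b"}, {"c"}, {"f"}],
    [{"a"}, {"c"}, {"b"}],
    [{"a", "b"}, {"e"}],
    [{"b"}, {"f"}],
]


def build_sparse_count_matrix(db):
    """Map each unordered item pair to the number of sequences containing both.

    Only pairs that co-occur at least once are stored, hence "sparse".
    This layout is an assumption made for illustration.
    """
    counts = defaultdict(int)
    for seq in db:
        items = set().union(*seq)  # all distinct items of the sequence
        for x, y in combinations(sorted(items), 2):
            counts[(x, y)] += 1
    return dict(counts)


def pair_count(scm, x, y):
    """Look up the co-occurrence count of two items; absent pairs count as 0."""
    return scm.get((min(x, y), max(x, y)), 0)


def may_be_frequent(scm, x, y, minsup):
    """Assumed pruning test: a rule involving both x and y cannot appear in
    more sequences than the pair itself does."""
    return pair_count(scm, x, y) >= minsup


def group_by_antecedent(rules):
    """Group rules (antecedent, consequent) into classes keyed by antecedent."""
    classes = defaultdict(list)
    for antecedent, consequent in rules:
        classes[frozenset(antecedent)].append(frozenset(consequent))
    return dict(classes)


scm = build_sparse_count_matrix(sequences)
rules = [({"a"}, {"c"}), ({"a"}, {"b"}), ({"b"}, {"f"})]
classes = group_by_antecedent(rules)
```

In this sketch, a candidate rule that would combine two items whose pair count is below the minimum support can be skipped without scanning the database, which is the kind of pruning a co-occurrence structure makes possible. Grouping by consequent would follow the same pattern with the key swapped.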
