An efficient method for mining non-redundant sequential rules using attributed prefix-trees

Abstract: Mining sequential generator patterns and mining closed sequential patterns are two important tasks in sequence mining. Both produce compact sets of patterns, which are often used to generate non-redundant sequential rules and thereby reduce the number of rules that sequential pattern mining yields. This paper proposes an efficient method for mining non-redundant sequential rules from an attributed prefix-tree. The proposed method has two phases. In the first phase, it builds a prefix-tree that stores all the sequential patterns of a given sequence database. In the second phase, it mines non-redundant sequential rules from this prefix-tree. During tree construction, each node is given two fields: one indicates whether the node is a minimal sequential generator pattern, and the other indicates whether it is a closed sequential pattern. By traversing the prefix-tree, non-redundant sequential rules can then be mined from each minimal sequential generator pattern X to each closed sequential pattern Y such that X is a prefix of Y. In addition, a pruning mechanism is proposed to reduce the search space and the execution time of the mining process. Experimental results show that the proposed method is more efficient than existing methods for mining non-redundant sequential rules.
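The two phases can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes each sequence is a list of single items (no itemsets), counts support by brute force, ignores the empty pattern when marking generators, and uses a simple confidence-based cut as a stand-in for the pruning mechanism. The names `Node`, `build_tree`, and `mine_rules` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    pattern: tuple              # sequential pattern stored at this node
    support: int                # number of database sequences containing it
    is_generator: bool = False  # attribute: minimal sequential generator?
    is_closed: bool = False     # attribute: closed sequential pattern?
    children: list = field(default_factory=list)


def is_subsequence(small, big):
    """True if `small` occurs in `big` in order (gaps allowed)."""
    it = iter(big)
    return all(item in it for item in small)


def support(pattern, db):
    return sum(is_subsequence(pattern, seq) for seq in db)


def build_tree(db, min_sup):
    """Phase 1: prefix-tree of all frequent patterns (requires min_sup >= 1)."""
    items = sorted({x for seq in db for x in seq})
    root = Node((), len(db))
    nodes = []

    def grow(node):
        for item in items:
            pat = node.pattern + (item,)
            sup = support(pat, db)
            if sup >= min_sup:
                child = Node(pat, sup)
                node.children.append(child)
                nodes.append(child)
                grow(child)

    grow(root)
    # Fill in the two attributes by comparing patterns of equal support.
    for n in nodes:
        same = [m for m in nodes if m is not n and m.support == n.support]
        n.is_closed = not any(is_subsequence(n.pattern, m.pattern) for m in same)
        n.is_generator = not any(is_subsequence(m.pattern, n.pattern) for m in same)
    return root


def mine_rules(root, min_conf):
    """Phase 2: rules X -> postfix, X a generator and prefix of a closed Y."""
    rules = []

    def descend(x, node):
        for child in node.children:
            conf = child.support / x.support
            if conf < min_conf:
                continue  # assumed pruning: support cannot grow deeper down
            if child.is_closed:
                postfix = child.pattern[len(x.pattern):]
                rules.append((x.pattern, postfix, child.support, conf))
            descend(x, child)

    def visit(node):
        for child in node.children:
            if child.is_generator:
                descend(child, child)
            visit(child)

    visit(root)
    return rules


db = [("a", "b", "c"), ("a", "b", "c"), ("a", "b"), ("b", "c")]
rules = mine_rules(build_tree(db, min_sup=2), min_conf=0.5)
```

Because every descendant of a generator node X has X as a prefix, a single walk of X's subtree enumerates all candidate closed patterns Y, and the rule's consequent is simply the part of Y after X. In this toy database the patterns `a`, `b`, `c`, and `ac` are marked as generators, and `b`, `ab`, `bc`, and `abc` as closed.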
