Incremental mining of weighted maximal frequent itemsets from dynamic databases

We propose an incremental mining algorithm that finds weighted maximal frequent itemsets. We devise strategies for guaranteeing the correctness of the proposed algorithm. We suggest performance-improving techniques for incremental pattern mining. We provide extensive performance evaluation results.

Frequent itemset mining finds hidden, important information in large databases. Processing incremental databases has become increasingly essential in itemset mining because huge amounts of data accumulate continually in a variety of application fields, and users want to obtain mining results from such incremental data more efficiently. One major problem in incremental itemset mining is that the mining results can be very large, depending on the threshold settings and data volumes. It is therefore hard to analyze all of them and find meaningful information, and not all of the mining results are actually important. To solve these problems, this paper proposes an algorithm for mining weighted maximal frequent itemsets from incremental databases. By scanning a given incremental database only once, the proposed algorithm both conducts its mining operations in a manner suited to the incremental environment and extracts a smaller number of important itemsets than previous approaches. The proposed method is also relevant to expert and intelligent systems, since it can automatically provide more meaningful pattern results that reflect the characteristics of the given incremental databases and threshold settings, which helps users analyze the given data more easily. Our comprehensive experimental results show that the proposed algorithm is more efficient and scalable than previous state-of-the-art algorithms.
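To make the target pattern type concrete, the following brute-force sketch lists the weighted maximal frequent itemsets of a tiny database before and after new transactions arrive. It is not the proposed algorithm: it recomputes everything from scratch instead of using a single scan, and it only illustrates how the result set shifts as data accumulates. The abstract does not define weighted support, so the sketch assumes a common definition (average item weight multiplied by support count); the function name, weights, transactions, and threshold are all hypothetical.

```python
from itertools import combinations


def weighted_maximal_frequent(transactions, weights, min_wsup):
    """Brute-force illustration (NOT the paper's single-scan method).

    Assumed definition: weighted support of an itemset is the average
    weight of its items multiplied by its support count.
    """
    items = sorted({item for t in transactions for item in t})
    frequent = []
    # Enumerate every non-empty itemset; this measure is not anti-monotone,
    # so no Apriori-style pruning is attempted here.
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            itemset = frozenset(combo)
            support = sum(1 for t in transactions if itemset <= t)
            avg_weight = sum(weights[i] for i in itemset) / len(itemset)
            if avg_weight * support >= min_wsup:
                frequent.append(itemset)
    # Keep only itemsets with no weighted-frequent proper superset.
    return [s for s in frequent if not any(s < other for other in frequent)]


# Hypothetical data for illustration only.
weights = {"a": 0.9, "b": 0.6, "c": 0.3}
original_db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
increment = [{"a", "c"}, {"a", "c"}]

before = weighted_maximal_frequent(original_db, weights, min_wsup=1.4)
after = weighted_maximal_frequent(original_db + increment, weights, min_wsup=1.4)
print(sorted(sorted(s) for s in before))  # [['a', 'b']]
print(sorted(sorted(s) for s in after))   # [['a', 'b'], ['a', 'c']]
```

With the original four transactions, {a}, {b}, and {a, b} are weighted frequent, so only {a, b} is maximal. After the two added transactions, {a, c} also reaches the threshold and joins the maximal set, which is the kind of change an incremental method has to track without re-mining the whole database.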
