Efficient representative pattern mining based on weight and maximality conditions

As a core area of data mining, frequent pattern (or itemset) mining has been studied for a long time. Weighted frequent pattern mining prunes unimportant patterns, and maximal frequent pattern mining discovers a compact set of frequent patterns. Both approaches improve mining performance by reducing the search space. However, integrating them requires care with both the downward closure property and the subset-checking process for patterns in order to prevent unintended pattern losses. It is also essential to extract valid patterns with shorter runtime and lower memory consumption. In this paper, we therefore propose more efficient maximal weighted frequent pattern (MWFP) mining approaches based on tree and array structures. We describe how to handle these problems efficiently while maintaining the correctness of our method. We develop two MWFP mining algorithms, one based on weight-ascending order and one based on support-descending order, and compare them to determine which is more suitable for MWFP mining. Comprehensive tests show that our algorithms are more efficient and scalable than state-of-the-art algorithms, and that they generate correct MWFP mining results.
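
To make the downward closure concern concrete, the following is a minimal brute-force sketch and not the tree- and array-based algorithms proposed in the paper. It assumes one common definition of weighted support (support count multiplied by the average item weight of the pattern), which the abstract itself does not state; the function names, the toy transactions, the item weights, and the threshold are all hypothetical.

```python
from itertools import combinations


def weighted_support(pattern, transactions, weights):
    """Support count times the average item weight (assumed definition)."""
    support = sum(1 for t in transactions if pattern <= t)
    avg_weight = sum(weights[i] for i in pattern) / len(pattern)
    return support * avg_weight


def maximal_weighted_frequent(transactions, weights, min_wsup):
    """Enumerate every itemset, keep the weighted frequent ones,
    then keep only those with no weighted frequent proper superset."""
    items = sorted(set().union(*transactions))
    frequent = [
        frozenset(c)
        for k in range(1, len(items) + 1)
        for c in combinations(items, k)
        if weighted_support(frozenset(c), transactions, weights) >= min_wsup
    ]
    # Subset checking: a pattern is maximal if no frequent pattern strictly contains it.
    return [p for p in frequent if not any(p < q for q in frequent)]


# Hypothetical toy data: item "b" has a low weight.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]
weights = {"a": 0.9, "b": 0.3, "c": 0.8}
min_wsup = 1.5

mwfps = maximal_weighted_frequent(transactions, weights, min_wsup)
print(sorted("".join(sorted(p)) for p in mwfps))
```

With this toy data, {b} falls below the threshold while its superset {a, b} meets it, so pruning {b} outright would lose a valid pattern. This is the kind of unintended loss that a naive combination of weight-based pruning and maximality checking can cause, and the exhaustive enumeration here sidesteps it only at exponential cost.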
