Efficient mining of high utility pattern with considering of rarity and length

Techniques for mining rare patterns have been studied in association rule mining because traditional frequent pattern mining methods must generate a large number of unnecessary patterns in order to find rare patterns in large databases. One such technique, the multiple minimum support threshold framework, was devised to extract rare patterns by assigning a different minimum item support threshold to each item in a database. Nevertheless, this framework does not sufficiently reflect real-world environments, because its mining process does not consider item weights such as the market prices of products or the fatality rates of diseases. An algorithm has therefore been proposed that mines rare patterns whose utilities exceed a user-specified minimum utility by considering both the rarity and the utility information of items. However, since this algorithm employs the concept of traditional high utility pattern mining, it does not take a pattern's length into account when determining the pattern's utility. If a pattern is sufficiently long, it is likely to accumulate enough utility to become a high utility pattern regardless of the utilities of the individual items within it. Consequently, the algorithm cannot guarantee that all items in a mined pattern have high utilities. In this paper, we propose a novel algorithm that reduces this dependency of patterns on their lengths by considering pattern length in the mining process, so that it mines more meaningful rare patterns than previous algorithms. Experimental results demonstrate that our algorithm extracts fewer but more meaningful patterns and consumes fewer computational resources than state-of-the-art algorithms.
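The length dependency described in the abstract can be illustrated with a small sketch. The abstract does not state which length-aware measure the proposed algorithm uses, so the code below assumes average utility (total utility divided by the number of items in the pattern), a measure from the high average-utility itemset literature, as a stand-in. The transactions, the profit table, the threshold, and all function names are hypothetical. The sketch enumerates every itemset by brute force and does not model rarity or any pruning strategy.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
TRANSACTIONS = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3, "d": 1},
    {"b": 1, "c": 2, "d": 2},
    {"a": 1, "b": 1, "c": 1, "d": 1},
]
# Hypothetical per-unit profit (external utility) of each item.
PROFIT = {"a": 1, "b": 2, "c": 1, "d": 10}


def utility(itemset, transactions=TRANSACTIONS, profit=PROFIT):
    """Traditional utility: sum of quantity * profit of the itemset's items
    over every transaction that contains the whole itemset."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * profit[item] for item in itemset)
    return total


def average_utility(itemset, transactions=TRANSACTIONS, profit=PROFIT):
    """Assumed length-aware measure: utility divided by pattern length."""
    return utility(itemset, transactions, profit) / len(itemset)


def patterns_meeting(measure, min_value, profit=PROFIT):
    """Brute-force enumeration of all itemsets whose measure >= min_value."""
    items = sorted(profit)
    found = set()
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            if measure(itemset) >= min_value:
                found.add(itemset)
    return found


MIN_UTILITY = 9  # hypothetical user-specified threshold
by_total = patterns_meeting(utility, MIN_UTILITY)
by_average = patterns_meeting(average_utility, MIN_UTILITY)

# No single item of {a, b, c} reaches the threshold on its own, yet the
# three-item pattern does under traditional utility because the item
# utilities add up as the pattern grows.
print(utility(("a",)), utility(("b",)), utility(("c",)))
print(utility(("a", "b", "c")), average_utility(("a", "b", "c")))
print(("a", "b", "c") in by_total, ("a", "b", "c") in by_average)
```

With non-negative utilities, the average utility of a pattern never exceeds its total utility, so under the same threshold the length-normalized measure can only return a subset of the traditionally mined patterns. This is consistent with the abstract's claim of fewer extracted patterns, though the paper's actual measure and pruning may differ from this sketch.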
