Discovering high utility itemsets with multiple minimum supports

Association rule mining generally uses a single minimum support threshold for the whole database. This model implicitly assumes that all items in the database are of the same nature. In real applications, however, items can differ in nature; medical datasets, for example, contain both diseases and the symptoms or statuses related to those diseases. Association rule mining therefore needs to consider multiple minimum supports, so that rules for all items can be discovered in a way that reflects each item's characteristics. Although this model can identify meaningful association rules, including rules for rare items, it considers neither the importance of items, such as the fatality rate of a disease, nor the attributes of items, such as the duration of a symptom, because it treats every item as equally important and represents the occurrence of an item in a transaction as a binary value. In this paper, we propose a novel tree structure, called MHU-Tree (Multiple item supports with High Utility Tree), which is constructed with a single scan of the database. Moreover, we propose an algorithm, named MHU-Growth (Multiple item supports with High Utility Growth), for mining high utility itemsets with multiple minimum supports. Experimental results show that MHU-Growth outperforms the previous algorithm on both real and synthetic datasets, and that it can discover useful rules from a medical dataset.
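To make the two ingredients of the abstract concrete, the following is a minimal brute-force sketch. It is not MHU-Tree or MHU-Growth, whose details the abstract does not give. It assumes the usual multiple-minimum-support convention that an itemset's threshold is the lowest minimum item support (MIS) among its items, and the usual utility definition of quantity times per-unit utility summed over the transactions that contain the itemset. How the paper combines the two conditions is also an assumption here: an itemset is reported only if it passes both. All names and data (`TRANSACTIONS`, `UNIT_UTILITY`, `MIS`, `mine`) are hypothetical.

```python
from itertools import combinations

# Hypothetical toy data: each transaction maps item -> quantity.
TRANSACTIONS = [
    {"a": 1, "b": 2},
    {"a": 2, "b": 1},
    {"a": 1, "c": 4},
    {"b": 3},
]
# Assumed per-unit utility (importance) of each item.
UNIT_UTILITY = {"a": 5, "b": 2, "c": 10}
# Assumed minimum item support per item; "c" is rare, so its MIS is lower.
MIS = {"a": 0.5, "b": 0.5, "c": 0.25}


def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    hits = sum(1 for t in transactions if all(i in t for i in itemset))
    return hits / len(transactions)


def utility(itemset, transactions, unit_utility):
    """Sum of quantity * unit utility over transactions containing the itemset."""
    total = 0
    for t in transactions:
        if all(i in t for i in itemset):
            total += sum(t[i] * unit_utility[i] for i in itemset)
    return total


def mine(transactions, unit_utility, mis, min_utility):
    """Enumerate all itemsets; keep those passing both checks (assumed combination)."""
    items = sorted(mis)
    result = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            # Threshold of an itemset = lowest MIS among its items.
            threshold = min(mis[i] for i in combo)
            if support(combo, transactions) < threshold:
                continue
            u = utility(combo, transactions, unit_utility)
            if u >= min_utility:
                result[combo] = u
    return result


result = mine(TRANSACTIONS, UNIT_UTILITY, MIS, min_utility=20)
for itemset, u in result.items():
    print(itemset, u)
```

On this toy data, the rare item `c` and the itemset `(a, c)` are kept only because `c` has its own lower threshold; a single minimum support of 0.5 would discard both even though they carry the highest utility. The exhaustive enumeration is exponential in the number of items and serves only to illustrate the definitions.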
