An Efficient Method for Mining Closed Potential High-Utility Itemsets

High-utility itemset mining (HUIM) has become a key phase of the pattern mining process. It has wide applications and takes into account both the quantities and the profits of items. Many algorithms have been proposed to mine high-utility itemsets (HUIs). Because these algorithms often return a large number of patterns, a more compact and lossless representation, the closed high-utility itemset (CHUI), has been proposed. Recently proposed closed high-utility itemset mining (CHUIM) algorithms were designed for precise databases, i.e., those without probabilities. Real-world databases, however, may contain items or itemsets associated with probability values. Several techniques have been developed to mine frequent patterns from uncertain databases, but no method exists for mining CHUIs from databases of this type. This work presents a novel and efficient method, named CPHUI-List, that mines closed potential high-utility itemsets (CPHUIs) from uncertain databases without generating candidates. The proposed algorithm is DFS-based and uses the downward closure property of high transaction-weighted probabilistic mining to prune non-CPHUIs. Experimental evaluations show that the proposed algorithm achieves lower execution time and memory usage than CHUI-Miner.
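To make the notion of a closed potential high-utility itemset more concrete, the following is a minimal brute-force sketch in Python. It is not the CPHUI-List algorithm, and the definitions it uses are assumptions that the abstract does not spell out: each transaction carries a single existence probability, the utility of an itemset is the sum of quantity times unit profit over the transactions that contain it, its potential probability is the sum of those transactions' probabilities, and an itemset is closed when no proper superset has the same support. The toy database, the profit table, the function names, and the use of absolute thresholds are all hypothetical.

```python
from itertools import combinations

# Hypothetical unit profits and uncertain database:
# each transaction is (item -> purchased quantity, existence probability).
PROFIT = {"a": 5, "b": 2, "c": 1}
DB = [
    ({"a": 1, "b": 2}, 0.9),
    ({"a": 2, "c": 3}, 0.6),
    ({"a": 1, "b": 1, "c": 1}, 0.8),
]


def stats(itemset):
    """Return (utility, summed probability, support) of an itemset."""
    utility, probability, support = 0, 0.0, 0
    for quantities, p in DB:
        if itemset.issubset(quantities):
            utility += sum(quantities[i] * PROFIT[i] for i in itemset)
            probability += p
            support += 1
    return utility, probability, support


def closed_potential_huis(min_util, min_prob):
    """Enumerate every itemset and keep the closed ones that meet both
    (assumed, absolute) thresholds. Exponential in the number of items."""
    items = sorted(PROFIT)
    all_sets = [
        frozenset(c)
        for k in range(1, len(items) + 1)
        for c in combinations(items, k)
    ]
    info = {s: stats(s) for s in all_sets}
    result = []
    for s in all_sets:
        utility, probability, support = info[s]
        if support == 0 or utility < min_util or probability < min_prob:
            continue
        # Closed: no proper superset occurs in exactly the same transactions,
        # which for supersets is equivalent to having the same support.
        if any(s < t and info[t][2] == support for t in all_sets):
            continue
        result.append(s)
    return result


if __name__ == "__main__":
    for itemset in closed_potential_huis(min_util=15, min_prob=1.5):
        print(sorted(itemset), stats(itemset))
```

Exhaustive enumeration like this is only feasible for a handful of items. The abstract indicates that the proposed method avoids it through a DFS-based search with downward-closure pruning.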
