FHUQI-Miner: Fast high utility quantitative itemset mining

High utility itemset mining is a popular pattern mining task that aims to reveal all sets of items that yield a high profit in a transaction database. Although this task is useful for understanding customer behavior, an important limitation is that high utility itemsets provide no information about the purchase quantities of items. Recently, some algorithms were designed to address this limitation by finding high utility quantitative itemsets, but they can have very long execution times because the search space is larger. This paper addresses the performance problem by proposing a novel, efficient algorithm for high utility quantitative itemset mining, called FHUQI-Miner (Fast High Utility Quantitative Itemset Miner). It performs a depth-first search and adopts two novel search space reduction strategies, named Exact Q-items Co-occurrence Pruning Strategy (EQCPS) and Range Q-items Co-occurrence Pruning Strategy (RQCPS). Experimental results show that the proposed algorithm is much faster than the state-of-the-art HUQI-Miner algorithm on sparse datasets.
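To make the distinction between the two tasks concrete, the following minimal Python sketch computes the utility of a plain itemset and of an itemset with exact quantities on a toy database. It is only an illustration of the problem definition, not FHUQI-Miner itself: the data, the unit profits, and the function names `itemset_utility` and `q_itemset_utility` are hypothetical, and range q-items and the EQCPS and RQCPS pruning strategies are not modeled.

```python
# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"a": 1, "b": 2, "c": 3},
]

# Hypothetical unit profit of each item.
unit_profit = {"a": 5, "b": 2, "c": 1}


def itemset_utility(itemset, transactions, unit_profit):
    """Total profit of `itemset` over the transactions containing all of its items.

    Quantities are used to compute the profit but are not part of the pattern.
    """
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * unit_profit[item] for item in itemset)
    return total


def q_itemset_utility(q_itemset, transactions, unit_profit):
    """Total profit of a quantitative itemset given as {item: exact quantity}.

    A transaction counts only if every item appears with exactly that quantity.
    """
    total = 0
    for t in transactions:
        if all(t.get(item) == qty for item, qty in q_itemset.items()):
            total += sum(qty * unit_profit[item] for item, qty in q_itemset.items())
    return total


if __name__ == "__main__":
    # Plain itemset: says that a and c sell well together, but not in what amounts.
    print(itemset_utility(("a", "c"), transactions, unit_profit))
    # Quantitative itemset: restricts the pattern to "2 units of a with 1 unit of c".
    print(q_itemset_utility({"a": 2, "c": 1}, transactions, unit_profit))
```

Because every item can be paired with many possible quantities, the number of candidate quantitative itemsets grows well beyond the number of plain itemsets, which is the larger search space the abstract refers to.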
