High Utility Item-set Mining from retail market data stream with various discount strategies using EGUI-tree

High Utility Item-set Mining (HUIM) extends Frequent Item-set Mining (FIM) to discover customer purchase trends in the retail market. Retailers can use this knowledge to design schemes that attract customers, such as discounts, cross-marketing, and seasonal sale offers. Although many HUIM algorithms can detect profitable patterns, most of them cannot be applied to every kind of retail data set because they rely on certain assumptions. The first assumption is that every item yields a positive profit. In practice, even when the overall profit of the purchased items is positive, a few items may have a negative profit. The second assumption is that the transactional data is static: the data is gathered up to a certain point in time and then analyzed. This supports decisions made at fixed intervals, such as quarterly, half-yearly, or yearly. To make decisions at any time based on the current sales trend, however, the data stream itself must be processed. This paper presents the Extended Global Utility Item-sets Tree (EGUI-tree), which extracts high utility item-sets from a retail market data stream containing items with positive and negative profit. A sliding window-based technique is applied to the data stream so that only the most recent data is processed. An experimental study on real-world datasets shows that the proposed EGUI-tree algorithm runs faster and scales well.
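
The two ideas named in the abstract, item-set utility with negative-profit items and a sliding window over a transaction stream, can be illustrated with a minimal Python sketch. This is not the EGUI-tree algorithm: it enumerates item-sets by brute force purely to show what is being computed. The item names, the `UNIT_PROFIT` table, and the functions `itemset_utility`, `high_utility_itemsets`, and `mine_stream` are all hypothetical, and each transaction is assumed to be a mapping from item to purchased quantity.

```python
from collections import deque
from itertools import combinations

# Hypothetical unit profits. A negative value models an item sold at a loss,
# for example because of a heavy discount.
UNIT_PROFIT = {"bread": 2, "milk": 3, "pen": -1}


def itemset_utility(itemset, transaction):
    """Utility of an item-set in one transaction.

    The utility is the sum of quantity * unit profit over the item-set's
    items, or 0 when the transaction does not contain every item.
    """
    if not all(item in transaction for item in itemset):
        return 0
    return sum(transaction[item] * UNIT_PROFIT[item] for item in itemset)


def high_utility_itemsets(window, min_util):
    """Brute-force search for item-sets whose total utility in the window
    reaches min_util. Exponential in the number of items; illustration only.
    """
    items = sorted({item for transaction in window for item in transaction})
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            total = sum(itemset_utility(itemset, t) for t in window)
            if total >= min_util:
                result[itemset] = total
    return result


def mine_stream(stream, window_size, min_util):
    """Slide a fixed-size window over the stream and mine each full window.

    deque(maxlen=...) drops the oldest transaction automatically, so only
    the most recent window_size transactions are ever analyzed.
    """
    window = deque(maxlen=window_size)
    for transaction in stream:
        window.append(transaction)
        if len(window) == window_size:
            yield high_utility_itemsets(window, min_util)


if __name__ == "__main__":
    stream = [
        {"bread": 1, "milk": 2},
        {"bread": 2, "pen": 3},
        {"milk": 1, "pen": 1},
    ]
    for number, found in enumerate(mine_stream(stream, window_size=2, min_util=5), 1):
        print(f"window {number}: {found}")
```

Because negative-profit items can lower an item-set's utility, a superset may score below its subsets, which is one reason pruning rules built for positive-only profits do not carry over directly.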
