An Efficient Approach for Mining High Utility Itemsets Over Data Streams

Frequent itemset mining considers only how often itemsets occur in a transaction database. High utility itemset mining also considers the purchased quantities and the profits of the items in each transaction, so that profitable products can be found. In addition, transactions accumulate continuously over time, so the database keeps growing. Older transactions that no longer represent current user behavior also need to be removed. An environment in which transactions are continuously added and removed over time is called a data stream. When transactions are added or deleted, the set of high utility itemsets changes. Previously proposed algorithms for mining high utility itemsets over data streams need to rescan the original database and generate a large number of candidate high utility itemsets, without making use of the previously discovered high utility itemsets. Therefore, this chapter proposes an approach for efficiently mining high utility itemsets over data streams. When transactions are added to or removed from the transaction database, our algorithm needs neither to scan the original transaction database nor to search through a large number of candidate itemsets. Experimental results also show that our algorithm outperforms the previous approaches.
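To make the notions of utility and of a data stream concrete, the following minimal Python sketch may help. It is not the algorithm proposed in this chapter: it is a naive illustration that enumerates every itemset of each transaction, which is only feasible for very short transactions. The profit table, the class name `SlidingWindowUtilities`, and the fixed-size window are assumptions introduced here for illustration. The sketch takes the utility of an itemset in a transaction to be the sum of quantity times unit profit over its items, and it updates the running totals when a transaction enters or leaves the window instead of rescanning the transactions already seen.

```python
from collections import defaultdict, deque
from itertools import combinations

# Hypothetical unit profits per item (illustrative values, not from the source).
PROFIT = {"a": 5, "b": 2, "c": 1}


def itemset_utilities(transaction):
    """Map every non-empty itemset of one transaction to its utility there.

    `transaction` maps item -> purchased quantity (assumed positive).
    The utility of an itemset is the sum of quantity * unit profit of its items.
    """
    items = sorted(transaction)
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            result[itemset] = sum(transaction[i] * PROFIT[i] for i in itemset)
    return result


class SlidingWindowUtilities:
    """Naive running utility totals over the most recent `window_size` transactions."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.window = deque()
        self.totals = defaultdict(int)

    def add(self, transaction):
        # Add the new transaction's contribution to the running totals.
        self.window.append(transaction)
        for itemset, utility in itemset_utilities(transaction).items():
            self.totals[itemset] += utility
        # Drop the oldest transaction once the window is over capacity.
        if len(self.window) > self.window_size:
            self._remove_oldest()

    def _remove_oldest(self):
        # Subtract the departing transaction's contribution; no rescan needed.
        old = self.window.popleft()
        for itemset, utility in itemset_utilities(old).items():
            self.totals[itemset] -= utility
            if self.totals[itemset] == 0:
                del self.totals[itemset]

    def high_utility(self, min_utility):
        """Return the itemsets whose total utility in the window meets the threshold."""
        return {s: u for s, u in self.totals.items() if u >= min_utility}


stream = SlidingWindowUtilities(window_size=2)
stream.add({"a": 1, "b": 2})
stream.add({"b": 1, "c": 3})
first = stream.high_utility(6)
stream.add({"a": 2, "c": 1})  # the oldest transaction leaves the window
second = stream.high_utility(6)
```

With a minimum utility of 6, `first` holds the itemsets `("b",)` and `("a", "b")`. After the third transaction arrives and the first one is dropped, `second` holds `("a",)` and `("a", "c")`, which shows how the set of high utility itemsets changes as the stream evolves. The exponential enumeration in `itemset_utilities` is exactly the kind of cost that practical algorithms are designed to avoid.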
