Damped window based high average utility pattern mining over data streams

Abstract: Data mining methods are required in both commercial and non-commercial areas, where pattern mining techniques can be used to find meaningful pattern information. Among these techniques, utility pattern mining (UPM) is better suited to evaluating the usefulness of patterns. The method introduced in this paper employs the high average utility pattern mining (HAUPM) approach, a UPM approach that uses a novel utility measure to discover interesting patterns whose items are more meaningfully related to one another. Meanwhile, past research on pattern mining algorithms has mainly focused on mining tasks that process static databases, such as batch operations. Most continuous, unbounded stream data, such as data constantly produced by heartbeat sensors, should be weighted differently by importance because up-to-date data may have a higher influence than old data. Therefore, our approach also adopts the damped window model to obtain more useful patterns in stream environments. Various experiments on real datasets, covering execution time, memory usage, scalability, and significance tests, demonstrate that the designed method not only provides important, recent pattern information but also requires fewer computational resources.
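The two ideas combined in the abstract, average utility and a damped window, can be illustrated with a minimal sketch. It is not the paper's algorithm: the function name `damped_average_utility`, the per-transaction decay scheme, and the sample data are all assumptions made for illustration. The average utility of a pattern in a transaction is taken here as the summed utility of its items divided by the pattern length, and each transaction's contribution is multiplied by a decay factor raised to its age.

```python
from typing import Dict, Iterable, List

# A transaction maps each item to its utility in that transaction
# (for example quantity * unit profit). This layout is an assumption.
Transaction = Dict[str, float]


def damped_average_utility(
    pattern: Iterable[str],
    stream: List[Transaction],
    decay: float,
) -> float:
    """Sum the decayed average utility of `pattern` over `stream`.

    `stream` is ordered oldest to newest. A transaction that arrived
    `age` steps before the newest one is weighted by decay ** age, so
    recent transactions count more. Decaying per transaction (rather
    than per batch) is an illustrative assumption.
    """
    items = list(pattern)
    if not items:
        raise ValueError("pattern must contain at least one item")
    if not 0.0 < decay <= 1.0:
        raise ValueError("decay must be in (0, 1]")

    newest = len(stream) - 1
    total = 0.0
    for index, transaction in enumerate(stream):
        # Only transactions containing every item of the pattern contribute.
        if all(item in transaction for item in items):
            average = sum(transaction[item] for item in items) / len(items)
            total += average * decay ** (newest - index)
    return total


if __name__ == "__main__":
    # Hypothetical stream of three transactions, oldest first.
    sample_stream = [
        {"a": 4.0, "b": 2.0},
        {"a": 6.0, "c": 3.0},
        {"a": 2.0, "b": 8.0},
    ]
    print(damped_average_utility(["a", "b"], sample_stream, decay=0.5))
```

A pattern would then be reported as a high average utility pattern when this value meets a user-chosen minimum threshold. With `decay=1.0` the sketch reduces to the ordinary, undamped average utility over a static database.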
