HUE-Span: Fast High Utility Episode Mining

High utility episode mining consists of finding episodes (sub-sequences of events) that have a high importance (e.g., high profit) in a sequence of events with quantities and weights. Although it has important real-life applications, the current problem definition has two critical limitations. First, it underestimates the utility of episodes because it does not take into account all timestamps of minimal occurrences when calculating utility, which can cause high utility episodes to be missed. Second, the state-of-the-art UP-Span algorithm is inefficient on large databases because it uses a loose upper bound on the utility to reduce the search space. This paper addresses the first issue by redefining the problem to guarantee that all high utility episodes are found. Moreover, it proposes an algorithm named HUE-Span that finds all patterns efficiently, relying on a novel upper bound to reduce the search space and a novel co-occurrence-based pruning strategy. Experimental results show that HUE-Span not only finds all patterns but is also up to five times faster than UP-Span.
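To make the notion of utility concrete, the following is a minimal sketch of how the utility of one episode occurrence might be computed from quantities and weights. It is not the paper's definition or the HUE-Span algorithm: the data layout, the function names (`occurrence_utility`, `sequence_utility`), and the toy values are all assumptions made for illustration, and it assumes the utility of an event is its quantity multiplied by its weight.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: timestamp -> list of (event, quantity) pairs.
EventSequence = Dict[int, List[Tuple[str, int]]]


def occurrence_utility(
    sequence: EventSequence,
    weights: Dict[str, int],
    occurrence: List[Tuple[int, str]],
) -> int:
    """Sum quantity * weight over the (timestamp, event) pairs of one occurrence.

    Assumption: an event's utility is its quantity times its weight.
    """
    total = 0
    for timestamp, event in occurrence:
        for candidate, quantity in sequence[timestamp]:
            if candidate == event:
                total += quantity * weights[event]
                break
        else:
            # The occurrence names an event that is absent at this timestamp.
            raise ValueError(f"{event!r} does not occur at timestamp {timestamp}")
    return total


def sequence_utility(sequence: EventSequence, weights: Dict[str, int]) -> int:
    """Total utility of every event in the sequence."""
    return sum(
        quantity * weights[event]
        for events in sequence.values()
        for event, quantity in events
    )


# Toy data (invented for this sketch).
sequence = {1: [("A", 2), ("B", 1)], 2: [("C", 3)], 3: [("A", 1)]}
weights = {"A": 5, "B": 2, "C": 1}

# One occurrence of the episode <A, C>: A at timestamp 1, then C at timestamp 2.
print(occurrence_utility(sequence, weights, [(1, "A"), (2, "C")]))  # 2*5 + 3*1 = 13
print(sequence_utility(sequence, weights))  # 10 + 2 + 3 + 5 = 20
```

Under these assumptions, an episode would count as "high utility" when its utility reaches a user-chosen threshold, for example a fraction of the total sequence utility. How utilities are combined across several occurrences of the same episode is exactly what the redefined problem addresses, and this sketch does not attempt to model it.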
