Approximate high utility itemset mining in noisy environments

Abstract: High utility pattern mining was proposed to overcome a limitation of frequent pattern mining, which cannot reflect the individual profits of items. It has been actively studied because it can find more valuable patterns than earlier pattern mining approaches. However, traditional approaches assume that the data stored in a database is faultless. If a database contains unknown errors such as noise, the results that traditional high utility pattern mining approaches obtain from it cannot be fully trusted. This paper proposes a novel technique that takes noise into account to overcome this limitation. The proposed technique uses a utility tolerance factor to calculate ranges of trustworthy utilities for patterns. With this factor, robust high utility patterns, called approximate high utility patterns, can be extracted from a noisy database. To evaluate the proposed algorithm, various experiments are conducted in terms of runtime, memory usage, and scalability. The experimental results show that the proposed algorithm outperforms its competitors, an Apriori-based approach and UP-Growth.
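The idea of a utility tolerance factor can be illustrated with a small sketch. This is not the paper's algorithm: the abstract does not define how the range is computed, so the symmetric range `u * (1 ± tolerance)`, the acceptance rule (the upper bound reaches the minimum utility), the brute-force enumeration, and all names and toy data below are assumptions made only for illustration.

```python
from itertools import combinations

# Hypothetical toy data: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 2, "b": 1},
    {"a": 1, "c": 3},
    {"a": 1, "b": 2, "c": 1},
]
# Hypothetical unit profit of each item.
profit = {"a": 5, "b": 3, "c": 1}


def utility(itemset, transactions, profit):
    """Sum quantity * profit over the transactions that contain every item."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * profit[item] for item in itemset)
    return total


def trust_range(u, tolerance):
    """Assumed symmetric range of trustworthy utility around a measured value."""
    return (u * (1 - tolerance), u * (1 + tolerance))


def approximate_high_utility(transactions, profit, min_util, tolerance):
    """Brute-force sketch: keep itemsets whose assumed upper bound reaches min_util."""
    items = sorted(profit)
    result = {}
    for k in range(1, len(items) + 1):
        for itemset in combinations(items, k):
            u = utility(itemset, transactions, profit)
            low, high = trust_range(u, tolerance)
            if u > 0 and high >= min_util:
                result[itemset] = (u, low, high)
    return result


if __name__ == "__main__":
    found = approximate_high_utility(transactions, profit, min_util=15, tolerance=0.1)
    for itemset, (u, low, high) in found.items():
        print(itemset, u, (low, high))
```

In this toy data the itemset `("a", "c")` has a measured utility of 14, which is below a minimum utility of 15, yet it is retained with a tolerance of 0.1 because its assumed upper bound exceeds the threshold. With a tolerance of 0 the sketch reduces to exact high utility itemset mining. A practical miner would prune the search space rather than enumerate every itemset.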
