An Efficient Algorithm for Hiding High Utility Sequential Patterns

Abstract: High utility sequential patterns (HUSP) are a type of pattern found in data collected in many domains, such as business, marketing and retail. Two critical topics related to HUSP are HUSP mining (HUSPM) and HUSP hiding (HUSPH). HUSPM algorithms discover all sequential patterns in a sequence database whose utility is greater than or equal to a minimum utility threshold. HUSPH algorithms, by contrast, conceal all HUSP so that competitors cannot find them in shared databases. This paper focuses on HUSPH and proposes an algorithm named HUS-Hiding to efficiently hide all HUSP. An extensive experimental evaluation on six real-life datasets shows that the proposed algorithm outperforms three state-of-the-art algorithms in terms of runtime, memory usage and hiding accuracy.
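To make the threshold condition concrete, the following is a minimal brute-force sketch of the check that defines a HUSP. It is not the HUS-Hiding algorithm, and it is not taken from the paper. It assumes the definition commonly used in the HUSPM literature: the utility of a pattern in one sequence is the maximum over all of its occurrences, and its utility in the database is the sum over the sequences that contain it. The toy database `DB`, the unit profits `PROFIT`, and all function names are hypothetical, and patterns are restricted to one item per element for brevity.

```python
from itertools import combinations

# Hypothetical unit profits (external utilities); not taken from the paper.
PROFIT = {"a": 2, "b": 5, "c": 1}

# Toy quantitative sequence database (an assumption for illustration).
# Each sequence is a list of itemsets; each itemset maps item -> quantity.
DB = [
    [{"a": 1}, {"b": 2}, {"a": 3, "c": 1}],
    [{"b": 1}, {"a": 2}, {"b": 3}],
    [{"c": 4}, {"a": 1}],
]


def utility_in_sequence(pattern, seq):
    """Return the maximum utility of `pattern` over all of its occurrences
    in `seq`, or None if the pattern does not occur in `seq`."""
    best = None
    # Try every choice of strictly increasing itemset positions.
    for positions in combinations(range(len(seq)), len(pattern)):
        if all(item in seq[p] for item, p in zip(pattern, positions)):
            u = sum(seq[p][item] * PROFIT[item]
                    for item, p in zip(pattern, positions))
            best = u if best is None else max(best, u)
    return best


def utility(pattern, db):
    """Sum the per-sequence utilities over sequences containing `pattern`."""
    per_sequence = (utility_in_sequence(pattern, s) for s in db)
    return sum(u for u in per_sequence if u is not None)


def is_high_utility(pattern, db, min_util):
    """A pattern is a HUSP if its utility reaches the minimum threshold."""
    return utility(pattern, db) >= min_util


if __name__ == "__main__":
    for pattern in (["a", "b"], ["b", "a"]):
        print(pattern, utility(pattern, DB), is_high_utility(pattern, DB, 30))
```

Under these assumptions, a hiding algorithm succeeds when it modifies the database so that `is_high_utility` returns false for every pattern that previously passed the threshold. The enumeration here is exponential in pattern length and is meant only to illustrate the definition, not to be efficient.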
