Updating the Built Prelarge Fast Updated Sequential Pattern Trees with Sequence Modification

Mining useful information or knowledge from very large databases to help managers and decision makers reach appropriate decisions has become a critical issue in recent years. Sequential patterns can be used to discover the purchasing behavior of customers or the usage behavior of users from Web log data. Most approaches process a static database and discover sequential patterns in batch mode. In real-world applications, however, the transactions or sequences in a database change frequently. A fast updated sequential pattern (FUSP) tree was previously proposed, based on FUP concepts, to handle dynamic databases under sequence insertion, deletion, or modification. With that approach, the original database must be rescanned whenever small sequences that were not kept in the FUSP tree need to be maintained. In this paper, the prelarge concept is adopted to maintain and update a built prelarge FUSP tree for sequence modification. A prelarge FUSP tree is a modified FUSP tree that preserves not only the frequent 1-sequences but also the prelarge 1-sequences in the tree structure. The PRELARGE-FUSP-TREE-MOD maintenance algorithm is proposed to reduce the number of rescans of the original database by exploiting the pruning properties of the prelarge concept. When the number of modified sequences is smaller than the safety bound of the prelarge concept, the proposed PRELARGE-FUSP-TREE-MOD maintenance algorithm obtains better results for sequence modification in dynamic databases.
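The prelarge idea and its safety bound can be illustrated with a minimal sketch. It is not the PRELARGE-FUSP-TREE-MOD algorithm itself and builds no tree. It assumes the usual two-threshold formulation from the pre-large literature: an upper support threshold separates large (frequent) 1-sequences from prelarge ones, and a lower threshold separates prelarge from small ones. It also assumes that sequence modification leaves the database size unchanged, so a small 1-sequence cannot become large as long as the number of modified sequences does not exceed (upper - lower) * db_size. The function names, parameter names, and example values are hypothetical and do not come from the abstract.

```python
from fractions import Fraction
from math import floor


def classify(count, db_size, upper, lower):
    """Label a 1-sequence as 'large', 'prelarge', or 'small'.

    upper and lower are the assumed support thresholds (as Fractions).
    """
    support = Fraction(count, db_size)
    if support >= upper:
        return "large"
    if support >= lower:
        return "prelarge"
    return "small"


def modification_safety_bound(db_size, upper, lower):
    """Assumed safety bound for modification: floor((upper - lower) * db_size).

    A small 1-sequence has count < lower * db_size, and each modified
    sequence can raise that count by at most one while db_size stays
    fixed, so up to this many modifications cannot make it large.
    """
    return floor((upper - lower) * db_size)


def needs_rescan(modified_so_far, db_size, upper, lower):
    """True once the accumulated modifications exceed the safety bound."""
    return modified_so_far > modification_safety_bound(db_size, upper, lower)


if __name__ == "__main__":
    upper, lower = Fraction(1, 2), Fraction(3, 10)  # hypothetical thresholds
    counts = {"A": 6, "B": 4, "C": 2}               # hypothetical 1-sequence counts
    db_size = 10
    for item, count in counts.items():
        print(item, classify(count, db_size, upper, lower))
    print("safety bound:", modification_safety_bound(db_size, upper, lower))
```

Exact fractions are used for the thresholds so that the floor in the bound is not affected by floating-point rounding.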
