Behavioral Constraint Template-Based Sequence Classification

In this paper we present the interesting Behavioral Constraint Miner (iBCM), a new approach to classifying sequences. The prevalence of sequential data, i.e., collections of ordered items such as text, website navigation patterns, and traffic management records, has spurred a surge of research interest in sequence classification. Existing approaches mainly focus on retrieving sequences of itemsets and checking their presence in labeled data streams to obtain a classifier. Rather than focusing on plain sequences, the proposed iBCM approach is template-based and draws its inspiration from behavioral patterns used for software verification. These patterns cover a broad range of characteristics and go beyond the typical sequence mining representation, allowing sequential information in a database to be captured more precisely and concisely. Furthermore, it is also possible to mine for negative information, i.e., sequences that do not occur. The technique is benchmarked against other state-of-the-art approaches and shows strong potential for sequence classification. Code related to this chapter is available at: http://feb.kuleuven.be/public/u0092789/.
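
To make the template-based idea concrete, the following is a minimal Python sketch, not the iBCM implementation. It assumes three illustrative templates in the style of behavioral constraints (response, precedence, and a negative "not succession" template) and turns each sequence into a vector of Boolean features that any standard classifier could consume. The template set, the function names, and the choice to require that the involved items actually occur are all assumptions made for illustration.

```python
from itertools import permutations
from typing import Callable, Dict, Sequence, Tuple


def response(seq: Sequence[str], a: str, b: str) -> bool:
    """a occurs, and every a is eventually followed by a b (assumes a != b)."""
    waiting = False
    for x in seq:
        if x == a:
            waiting = True
        elif x == b:
            waiting = False
    return a in seq and not waiting


def precedence(seq: Sequence[str], a: str, b: str) -> bool:
    """b occurs, and every b is preceded by at least one a (assumes a != b)."""
    seen_a = False
    seen_b = False
    for x in seq:
        if x == a:
            seen_a = True
        elif x == b:
            if not seen_a:
                return False
            seen_b = True
    return seen_b


def not_succession(seq: Sequence[str], a: str, b: str) -> bool:
    """Negative template: a and b both occur, but b never occurs after a."""
    seen_a = False
    for x in seq:
        if x == a:
            seen_a = True
        elif x == b and seen_a:
            return False
    return a in seq and b in seq


# Hypothetical template registry; the real approach may use a different set.
TEMPLATES: Dict[str, Callable[[Sequence[str], str, str], bool]] = {
    "response": response,
    "precedence": precedence,
    "not_succession": not_succession,
}


def featurize(seq: Sequence[str], alphabet: Sequence[str]) -> Dict[Tuple[str, str, str], bool]:
    """Evaluate every template on every ordered pair of distinct items."""
    return {
        (name, a, b): check(seq, a, b)
        for name, check in TEMPLATES.items()
        for a, b in permutations(alphabet, 2)
    }


features = featurize(list("abcab"), alphabet=["a", "b"])
```

Each key of `features` names one constraint, so a classifier trained on such vectors can be read in terms of which behavioral constraints hold or fail, including the negative ones.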
