Mining conditional discriminative sequential patterns

Abstract: Discriminative sequential pattern mining is an important topic in pattern mining with a wide range of applications. Its goal is to extract sequential patterns whose occurrence differs significantly across classes. A variety of algorithms for mining discriminative sequential patterns have been proposed in recent years, but they still tend to report many redundant patterns. Several factors can cause such redundancy; the most critical is subset-induced redundancy, in which a pattern is reported as discriminative mainly because some of its sub-patterns are strongly discriminative. To address this issue, we propose the concept of a conditional discriminative sequential pattern and design a new algorithm, CDSPM (Conditional Discriminative Sequential Pattern Mining), for extracting such patterns. Experimental results on real data sets show that CDSPM can effectively remove discriminative sequential patterns that are redundant with respect to their sub-patterns.
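Subset-induced redundancy can be illustrated with a small sketch. The code below is not the CDSPM algorithm, and the abstract does not give the paper's formal definition; it assumes, purely for illustration, that discriminative power is measured as a difference in class-wise support and that "conditional" means re-measuring that difference only on sequences containing the sub-pattern. All function names and the toy data are hypothetical.

```python
from typing import List, Sequence, Tuple

Seq = Tuple[str, ...]


def contains(seq: Sequence[str], pattern: Sequence[str]) -> bool:
    """Return True if `pattern` occurs in `seq` as a subsequence (gaps allowed)."""
    it = iter(seq)
    # `item in it` advances the iterator, so order is respected.
    return all(item in it for item in pattern)


def support(pattern: Seq, sequences: List[Seq]) -> float:
    """Fraction of sequences that contain `pattern` (0.0 for an empty list)."""
    if not sequences:
        return 0.0
    return sum(contains(s, pattern) for s in sequences) / len(sequences)


def support_diff(pattern: Seq, pos: List[Seq], neg: List[Seq]) -> float:
    """Assumed discriminative measure: support in `pos` minus support in `neg`."""
    return support(pattern, pos) - support(pattern, neg)


def conditional_diff(pattern: Seq, sub: Seq, pos: List[Seq], neg: List[Seq]) -> float:
    """Assumed conditional measure: support difference of `pattern`
    restricted to the sequences that already contain `sub`."""
    pos_c = [s for s in pos if contains(s, sub)]
    neg_c = [s for s in neg if contains(s, sub)]
    return support_diff(pattern, pos_c, neg_c)


# Toy data: "a" is strongly discriminative; "b" follows "a" equally often in both classes.
POS: List[Seq] = [("a", "b"), ("a", "b"), ("a",), ("a",)]
NEG: List[Seq] = [("a", "b"), ("a",)] + [("c",)] * 6

if __name__ == "__main__":
    print(support_diff(("a",), POS, NEG))                 # sub-pattern alone
    print(support_diff(("a", "b"), POS, NEG))             # super-pattern, unconditional
    print(conditional_diff(("a", "b"), ("a",), POS, NEG)) # super-pattern given "a"
```

In this toy data the pattern ("a", "b") has a clearly positive unconditional support difference, yet the difference vanishes once only sequences containing "a" are considered. Under the stated assumptions, that is the kind of pattern a conditional criterion would treat as redundant with respect to its sub-pattern.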
