Mining full, inner and tail periodic patterns with perfect, imperfect and asynchronous periodicity simultaneously

Periodic patterns arise in many real-life applications, such as seasonal weather conditions, superstore transactions, power consumption, computer network fault analysis, and the analysis of DNA and protein sequences. Periodic pattern mining is a popular but challenging research field in data mining because periodic patterns come in different types (full, inner, and tail patterns) and exhibit varied periodicities (perfect, imperfect, and asynchronous). Previous periodic pattern mining methods have two main disadvantages: (1) they have to find different kinds of patterns separately; and (2) they require postprocessing, such as level-by-level join strategies, to mine complex periodic patterns that have wildcards between two items. As a result, they cannot mine full, tail, and inner periodic patterns with perfect, imperfect, and asynchronous periodicities simultaneously. An effective and comprehensive approach capable of discovering all of these kinds of periodic patterns is therefore needed. We propose a novel suffix tree-based algorithm, MIPPS (Mining dIfferent kinds of Periodic Patterns Simultaneously), to address these issues. Using a novel incremental propagation generator, MIPPS finds different kinds of periodic patterns with different periodicities simultaneously, without level-by-level join techniques. In addition, MIPPS uses pruning strategies to mine periodic patterns efficiently. Experiments on both synthetic and real data confirm that MIPPS performs well and scales to complex periodic patterns.
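To make the distinction between perfect and imperfect periodicity concrete, the following minimal Python sketch scores a candidate pattern by the fraction of expected positions at which it actually occurs. This is not the MIPPS algorithm and uses no suffix tree; it assumes a confidence-style measure common in the periodicity-mining literature. The function names `matches` and `periodic_confidence`, the `*` wildcard symbol, and the example sequences are all hypothetical choices made for illustration. Asynchronous periodicity is not covered.

```python
def matches(window, pattern, wildcard="*"):
    """Return True if window equals pattern, treating wildcard as 'any symbol'."""
    return len(window) == len(pattern) and all(
        p == wildcard or p == w for w, p in zip(window, pattern)
    )


def periodic_confidence(series, pattern, period, start=0, wildcard="*"):
    """Fraction of positions start, start+period, ... where pattern occurs.

    A value of 1.0 corresponds to perfect periodicity; a value below 1.0
    but above a user-chosen threshold corresponds to imperfect periodicity.
    (Illustrative definition, assumed here rather than taken from the paper.)
    """
    if period <= 0:
        raise ValueError("period must be positive")
    if not pattern:
        raise ValueError("pattern must be non-empty")
    # Only positions where the whole pattern still fits are counted.
    positions = range(start, len(series) - len(pattern) + 1, period)
    if len(positions) == 0:
        return 0.0
    hits = sum(
        1 for i in positions
        if matches(series[i:i + len(pattern)], pattern, wildcard)
    )
    return hits / len(positions)


if __name__ == "__main__":
    # 'a' recurs at every expected position with period 3.
    print(periodic_confidence("abcabcabcabc", "a", period=3, start=0))
    # 'c' is missing at one of the four expected positions.
    print(periodic_confidence("abcabdabcabc", "c", period=3, start=2))
    # A pattern with a wildcard between two items.
    print(periodic_confidence("abcaxcabcabc", "a*c", period=3, start=0))
```

A brute-force check like this must be repeated for every candidate pattern, period, and start position, which is the kind of cost that a suffix-tree index and pruning strategies are intended to reduce.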
