A Pattern-Based Bayesian Classifier for Data Stream

An advanced approach to Bayesian classification is based on exploiting patterns. However, traditional pattern-based Bayesian classifiers cannot adapt to an evolving data stream environment. To address this, an effective Pattern-based Bayesian classifier for Data Stream (PBDS) is proposed. First, a data-driven lazy learning strategy is employed to discover local frequent patterns for each test record. Furthermore, we propose a summary data structure that represents the data compactly and makes finding patterns for each class more efficient. Greedy search and minimum description length, combined with a Bayesian network, are applied to evaluate the extracted patterns. Experimental studies on real-world and synthetic data streams show that PBDS outperforms most state-of-the-art data stream classifiers.
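
The lazy, data-driven step can be illustrated with a minimal sketch. It is not the PBDS algorithm itself: it assumes a fixed-size sliding window as a stand-in for the summary data structure, a plain support threshold, and brute-force enumeration of short itemsets. The names `local_frequent_patterns`, `min_support`, and `max_len` are hypothetical and do not come from the paper. The idea shown is only that patterns are mined per test record, by projecting recent labeled records onto the attribute values of that test record and counting the surviving itemsets per class.

```python
from collections import Counter, deque
from itertools import combinations


def local_frequent_patterns(window, test_record, min_support=2, max_len=2):
    """Count, per class, the itemsets of the test record seen in the window.

    Illustrative only: `window` is any iterable of (features, label) pairs,
    where `features` is a dict of attribute -> discrete value.
    """
    test_items = frozenset(test_record.items())
    counts = {}  # class label -> Counter mapping itemset -> support count
    for features, label in window:
        # Project the stored record onto the test record: keep only the
        # (attribute, value) pairs that the two records share.
        shared = sorted(test_items & frozenset(features.items()))
        per_class = counts.setdefault(label, Counter())
        for k in range(1, max_len + 1):
            for itemset in combinations(shared, k):
                per_class[itemset] += 1
    # Keep only itemsets that reach the (assumed) support threshold.
    return {
        label: {p: c for p, c in ctr.items() if c >= min_support}
        for label, ctr in counts.items()
    }


# A bounded window keeps only the most recent labeled records of the stream.
window = deque(maxlen=4)
window.extend([
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "yes"}, "stay"),
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "rain", "windy": "no"}, "play"),
])

test = {"outlook": "sunny", "windy": "no"}
patterns = local_frequent_patterns(window, test, min_support=2)
for label, found in patterns.items():
    print(label, found)
```

Because the window is bounded, old records drop out as new ones arrive, which is one simple way for the mined patterns to follow an evolving stream. Scoring the patterns (greedy search, minimum description length, Bayesian network) is outside the scope of this sketch.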
