A Fuzzy Classifier for Data Streams with Infinitely Delayed Labels

In data stream learning, classification is a prominent task that aims to predict the class labels of incoming examples. However, most classification approaches in the literature make assumptions that limit their usefulness in real scenarios, such as the supposition that the true label of an example becomes available right after its prediction, i.e., that there is no delay in acquiring actual labels. This is a very optimistic assumption, since labeling the entire data stream is usually not feasible. Some recent approaches overcome this limitation by using unsupervised learning methods to deal with delayed labels. Other proposals explore concepts from fuzzy set theory to add flexibility to the learning process, but they are restricted to data streams without delayed labels. In this paper, we propose FuzzMiC, a fuzzy classifier for data streams with infinitely delayed labels. Our algorithm generates a model based on fuzzy micro-clusters that provides flexible class boundaries and allows the classification of evolving data streams. Experiments show that our approach is promising in dealing with incremental changes.
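
To make the idea of flexible class boundaries concrete, the following minimal sketch shows one way a new example could be assigned graded memberships to class-labeled micro-clusters. It is not the FuzzMiC procedure, which the abstract does not specify. It assumes each micro-cluster is summarized only by a centroid and a class label, uses the standard fuzzy c-means membership formula with fuzzifier `m`, and aggregates memberships per class by summation; the names `fcm_memberships` and `classify` are hypothetical.

```python
import math


def fcm_memberships(x, centroids, m=2.0):
    """Fuzzy c-means style membership of point x in each centroid (sums to 1)."""
    dists = [math.dist(x, c) for c in centroids]
    # A point lying exactly on one or more centroids belongs fully to them.
    zeros = [i for i, d in enumerate(dists) if d == 0.0]
    if zeros:
        return [1.0 / len(zeros) if i in zeros else 0.0
                for i in range(len(dists))]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dk) ** exponent for dk in dists) for di in dists]


def classify(x, micro_clusters, m=2.0):
    """Assumed decision rule: sum memberships per class, pick the largest.

    micro_clusters is a list of (centroid, label) pairs.
    Returns the chosen label and the per-class membership totals.
    """
    memberships = fcm_memberships(x, [c for c, _ in micro_clusters], m)
    per_class = {}
    for (_, label), u in zip(micro_clusters, memberships):
        per_class[label] = per_class.get(label, 0.0) + u
    best = max(per_class, key=per_class.get)
    return best, per_class


# Two hypothetical micro-clusters, one per class.
clusters = [((0.0, 0.0), "a"), ((4.0, 0.0), "b")]
label, scores = classify((1.0, 0.0), clusters)
print(label, scores)
```

Because every example receives a degree of membership in each class rather than a hard assignment, examples near a boundary can be recognized as ambiguous, which is the kind of flexibility the fuzzy model is meant to provide.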
