On Fuzzy Clustering of Data Streams with Concept Drift

This paper considers clustering algorithms based on fuzzy set theory. Modifications of the Fuzzy C-Means and Possibilistic C-Means algorithms are presented that adapt them to data streams. Since a data stream is potentially infinite in size, it has to be partitioned into chunks that are processed one at a time. Simulations show that this partitioning procedure does not significantly affect the quality of the clustering results. Moreover, properly chosen weights can be assigned to each data element, so that older elements have less influence than recent ones. In the simulations, this modification allows the presented algorithms to handle concept drift.
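The chunk-and-weight idea can be illustrated with a minimal sketch. It is not the paper's exact procedure: the abstract does not specify the weighting scheme, so the sketch assumes a single-pass style in which each processed chunk is summarised by its cluster centres, and those centres are carried into the next chunk as weighted points whose weights are multiplied by a forgetting factor. The names `weighted_fcm`, `cluster_stream` and `decay` are hypothetical, and only Fuzzy C-Means is shown.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def memberships(point, centers, m):
    """Standard FCM membership degrees of one point in every cluster."""
    d = [sq_dist(point, c) for c in centers]
    if min(d) == 0.0:
        # A point lying exactly on a centre belongs fully to that centre.
        hits = [1.0 if dj == 0.0 else 0.0 for dj in d]
        return [h / sum(hits) for h in hits]
    power = 1.0 / (m - 1.0)  # exponent applied to squared distances
    inv = [(1.0 / dj) ** power for dj in d]
    total = sum(inv)
    return [v / total for v in inv]


def weighted_fcm(points, weights, centers, m=2.0, iters=50, tol=1e-6):
    """Weighted FCM on one chunk; returns centres and their carried weights."""
    dim = len(centers[0])
    for _ in range(iters):
        sums = [[0.0] * dim for _ in centers]
        mass = [0.0] * len(centers)
        for x, w in zip(points, weights):
            for j, uj in enumerate(memberships(x, centers, m)):
                coef = w * uj ** m  # element weight times fuzzified membership
                mass[j] += coef
                for k in range(dim):
                    sums[j][k] += coef * x[k]
        new_centers = [
            tuple(s / mass[j] for s in sums[j]) if mass[j] > 0 else centers[j]
            for j in range(len(centers))
        ]
        sq_shift = max(sq_dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if sq_shift < tol:
            break
    # Weight each centre by the total (weighted) membership it attracted.
    carried = [0.0] * len(centers)
    for x, w in zip(points, weights):
        for j, uj in enumerate(memberships(x, centers, m)):
            carried[j] += w * uj
    return centers, carried


def cluster_stream(chunks, init_centers, decay=0.5, m=2.0):
    """Process chunks in order; old centres re-enter with decayed weight."""
    centers = list(init_centers)
    carried = [0.0] * len(centers)
    for chunk in chunks:
        points = list(chunk) + centers
        # New elements get weight 1; summaries of the past are down-weighted
        # by `decay` (an assumed forgetting factor, 0 < decay <= 1).
        weights = [1.0] * len(chunk) + [decay * w for w in carried]
        centers, carried = weighted_fcm(points, weights, centers, m)
    return centers


# One chunk from the old concept, then four chunks after the clusters drift.
old = [(-0.2,), (0.0,), (0.2,), (9.8,), (10.0,), (10.2,)]
new = [(2.8,), (3.0,), (3.2,), (12.8,), (13.0,), (13.2,)]
final = cluster_stream([old, new, new, new, new], [(1.0,), (9.0,)])
print(sorted(final))
```

With `decay` below 1 the summaries of earlier chunks lose influence geometrically, so the centres follow the drifted data; with `decay = 1` every past element keeps its full weight and the centres settle near the average over the whole stream instead.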
