On Resources Optimization in Fuzzy Clustering of Data Streams

This paper studies the resource consumption of fuzzy clustering algorithms for data streams, using the wFCM and wPCM algorithms as examples. It is shown that partitioning a data stream into chunks significantly reduces the processing time of the considered algorithms. Partitioning is accompanied by a reduction in the accuracy of the results; however, the change is acceptable. The problems that arise with high-speed data streams are presented as well. The uncontrollable growth of subsequent data chunk sizes, which leads to overflow of the available memory, is demonstrated for both the wFCM and wPCM algorithms. A maximum chunk size limit is introduced as a solution to this problem. Simulations show that this modification ensures the available memory is never exceeded, while decreasing the quality of the clustering results only slightly.
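
The chunk growth problem and the effect of a size limit can be illustrated with a minimal sketch. It is not the authors' simulation: the function name `simulate_chunk_sizes`, its parameters, the assumption that processing time is linear in the chunk size, and the choice to drop items beyond the limit are all hypothetical and serve only to show the mechanism.

```python
def simulate_chunk_sizes(first_chunk, arrival_rate, ms_per_item, steps, max_chunk=None):
    """Return the sizes of successive data chunks.

    Assumed model (an illustration, not taken from the paper):
    - processing a chunk of n items takes n * ms_per_item milliseconds;
    - the next chunk holds every item that arrived during that time,
      i.e. arrival_rate (items per second) * processing time;
    - if max_chunk is set, the next chunk is truncated to that size and
      the surplus items are dropped.
    """
    sizes = [first_chunk]
    for _ in range(steps):
        processing_ms = sizes[-1] * ms_per_item
        arrived = arrival_rate * processing_ms // 1000  # integer arithmetic
        if max_chunk is not None:
            arrived = min(arrived, max_chunk)
        sizes.append(arrived)
    return sizes


# A stream that delivers items twice as fast as they can be processed.
unbounded = simulate_chunk_sizes(100, arrival_rate=2000, ms_per_item=1, steps=5)
bounded = simulate_chunk_sizes(100, arrival_rate=2000, ms_per_item=1, steps=5,
                               max_chunk=500)
print("no limit:  ", unbounded)
print("limit 500: ", bounded)
```

Under these assumptions each unbounded chunk is twice the size of the previous one, whereas the limited run levels off at the chosen maximum. How the surplus items are handled in the actual modification is not stated in the abstract, so dropping them here is only one possible choice.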
