Constructing a Virtual Space for Enhancing the Classification Performance of Fuzzy Clustering

Clustering offers a general methodology with a rich conceptual and algorithmic framework for data analysis and interpretation. Fuzzy C-means (FCM), one of the most representative fuzzy clustering algorithms, is a widely used objective function-based clustering method applied in many domains. In this study, a virtual-based fuzzy clustering algorithm is proposed to improve the classification performance obtained with fuzzy clustering. This improvement is achieved by forming a virtual space based on the original data space. First, we construct a piecewise linear transformation function to modify the similarity matrix of the original data and build the so-called virtual similarity matrix (VSM). In the VSM, the effect of closeness is amplified: high similarity values (say, larger than α, the cutoff separating large and small similarity values in this paper) present in the original similarity matrix are made higher, whereas low similarity values (say, smaller than α) are further reduced. In addition, data with high similarity in the original space (say, larger than a certain threshold value) overlap significantly in the virtual space, meaning that the attributes of the samples are exactly the same; the overlapping samples can be treated as one sample. This modification makes clusters easier to identify. Second, we build a relationship matrix between the original dataset and the determined similarity values and present two closed-form solutions to the problem of building the relationship matrix. Subsequently, a virtual space of the original data space is derived from the modified similarity matrix and the introduced relationship matrix. We offer a thorough analysis of the developed clustering algorithm. The experimental results are in agreement with the underlying conceptual basis. Furthermore, the resulting classification performance is significantly improved compared with the results produced by FCM and kernel-based fuzzy C-means.
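The amplification step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact form of the piecewise linear transformation, so the breakpoints used here are assumptions made only for illustration: α (`alpha`) is the cutoff named in the abstract, while `beta` (below which similarities are set to 0) and `tau` (the threshold above which similarities saturate at 1, so that samples overlap) are hypothetical parameters, as is the function name `virtual_similarity`. Similarities are assumed to lie in [0, 1].

```python
import numpy as np


def virtual_similarity(S, alpha, beta, tau):
    """Illustrative piecewise linear map of a similarity matrix S in [0, 1].

    Assumed (hypothetical) breakpoints, with 0 < beta < alpha < tau < 1:
      s <= beta          -> 0                    (weak similarity removed)
      beta < s < alpha   -> value below s        (low similarity reduced)
      s == alpha         -> alpha                (cutoff is a fixed point)
      alpha < s < tau    -> value above s        (high similarity raised)
      s >= tau           -> 1                    (samples treated as overlapping)
    """
    if not (0.0 < beta < alpha < tau < 1.0):
        raise ValueError("expected 0 < beta < alpha < tau < 1")
    S = np.asarray(S, dtype=float)
    # np.interp applies the piecewise linear function elementwise.
    xp = [0.0, beta, alpha, tau, 1.0]
    fp = [0.0, 0.0, alpha, 1.0, 1.0]
    return np.interp(S, xp, fp)


# Small example: samples 0 and 1 are very similar, sample 2 is far from both.
S = np.array([
    [1.0, 0.9, 0.2],
    [0.9, 1.0, 0.4],
    [0.2, 0.4, 1.0],
])
V = virtual_similarity(S, alpha=0.5, beta=0.1, tau=0.8)
```

In this example the similarity 0.9 exceeds the assumed threshold `tau` and is mapped to 1, so samples 0 and 1 would coincide in the virtual space, while the similarities 0.2 and 0.4 fall below α and are pushed down further. The construction of the relationship matrix and the recovery of virtual-space coordinates are not covered by this sketch.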
