Fuzzy Clustering Systems in Analyzing High Dimensional Database

Separating malignant pleural mesothelioma (MPM) from adenocarcinoma (ADCA) using the gene expression data of a lung cancer database is difficult because the data are high-dimensional and noisy. This paper proposes novel fuzzy soft clustering systems, combined with possibilistic c-means, that accurately distinguish MPM from ADCA using the gene expression ratios of the lung cancer database. Because the proposed method can cluster the highly correlated gene expression data of the lung cancer database, all 181 tissue samples are used, for the first time, to identify MPM and ADCA in the experiments. The performance of the proposed method in clustering the lung cancer database is reported in terms of clustering accuracy and the error matrix.
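The abstract does not spell out the proposed algorithm, so the following is only a minimal sketch of the standard fuzzy c-means update that fuzzy clustering methods of this kind typically build on, followed by a two-class error matrix and clustering accuracy. It is not the authors' method and it omits the possibilistic term. The function names (`fuzzy_c_means`, `error_matrix`, `clustering_accuracy`), the fuzzifier `m = 2`, and the synthetic two-group data standing in for the gene expression ratios are all assumptions made for illustration.

```python
import numpy as np

def fuzzy_c_means(X, c=2, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means; returns (centers, memberships)."""
    rng = np.random.default_rng(seed)
    # Random initial memberships; each row sums to 1.
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # Centers are means of the samples weighted by u_ik^m.
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Squared sample-to-center distances, floored to avoid division by zero.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)
        # u_ik is proportional to d_ik^(-2/(m-1)), normalized over clusters.
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

def error_matrix(true_labels, pred_labels, c=2):
    """Rows are true classes, columns are assigned clusters."""
    M = np.zeros((c, c), dtype=int)
    for t, p in zip(true_labels, pred_labels):
        M[t, p] += 1
    return M

def clustering_accuracy(M):
    """Two-class accuracy; cluster ids are arbitrary, so try both labelings."""
    return max(np.trace(M), np.trace(M[:, ::-1])) / M.sum()

# Synthetic stand-in data (assumption): two well-separated groups, 5 features.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.5, (30, 5)), rng.normal(3.0, 0.5, (20, 5))])
y = np.array([0] * 30 + [1] * 20)

centers, U = fuzzy_c_means(X)
M = error_matrix(y, U.argmax(axis=1))  # harden memberships by argmax
print(M)
print(clustering_accuracy(M))
```

Hardening the memberships with `argmax` is one simple way to obtain the labels needed for an error matrix; other defuzzification rules are possible.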
