Performance Comparisons between Unsupervised Clustering Techniques for Microarray Data Analysis on Ovarian Cancer

This paper compares the performance of several unsupervised clustering techniques, namely the Self-Organizing Map (SOM), Fuzzy C-means (FCM), and hierarchical clustering, applied to ovarian cancer microarray data. The data comprise 15 samples of 9,600 genes each: 5 benign ovarian tumors (OVT), 1 borderline ovarian malignancy (OVTT), 4 stage I ovarian cancers (OVCAI), and 5 stage III ovarian cancers (OVCAIII). Regression analysis is used to reduce the dimensionality, yielding a residual for each of the 9,600 genes. The genes with the 100 largest and the 100 smallest residuals are selected and examined with analysis of variance (ANOVA). The ANOVA yields 12 gene markers that can be used to distinguish the OVT, OVTT, OVCAI, and OVCAIII samples. These 12 gene markers are then clustered with SOM, FCM, and hierarchical clustering, and the results of the three techniques are compared. Our experimental results show that hierarchical clustering gives the best clustering performance and does not require the user to specify the number of clusters in advance.
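The gene-selection steps described above (regression residuals, choice of extreme residuals, ANOVA) can be illustrated with a minimal Python sketch. The abstract does not state which variables enter the regression, so the sketch assumes a simple least-squares fit of one expression vector on another across genes. The function names `ols_residuals`, `extreme_indices`, and `anova_f` are hypothetical, and the toy numbers are illustrative only, not data from the study.

```python
from statistics import mean

def ols_residuals(x, y):
    """Residuals of a simple least-squares fit y ~ a + b*x (assumed model)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx          # slope
    a = my - b * mx        # intercept
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

def extreme_indices(residuals, k):
    """Indices of the k smallest and the k largest residuals."""
    order = sorted(range(len(residuals)), key=residuals.__getitem__)
    return order[:k], order[-k:]

def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of values."""
    values = [v for g in groups for v in g]
    grand = mean(values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Toy example: six "genes" with made-up expression values (not study data).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 2.0, 6.0, 4.0, 1.0, 6.1]
res = ols_residuals(x, y)
lowest, highest = extreme_indices(res, k=1)   # the paper uses k = 100

# F statistic for one candidate gene across two made-up sample groups;
# the study compares four groups (OVT, OVTT, OVCAI, OVCAIII).
f_value = anova_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]])
```

With 9,600 genes, `extreme_indices(res, k=100)` would return the 200 candidate genes, each of which could then be scored with `anova_f` over the four sample groups. How the 12 markers are chosen from those scores is not specified in the abstract and is left out here.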
