Microarray Leukemia Gene Data Clustering by Means of Generalized Self-organizing Neural Networks with Evolving Tree-Like Structures

This paper presents the application of our clustering technique, based on generalized self-organizing neural networks with evolving tree-like structures, to complex cluster-analysis problems including, in particular, the sample-based and gene-based clustering of the microarray Leukemia gene data set. Our approach works in a fully unsupervised way, i.e., it uses unlabelled data and does not require the number of clusters to be predefined. This is particularly important in the gene-based clustering of microarray data, for which the number of gene clusters is unknown in advance. In the sample-based clustering of the Leukemia data set, our approach gives better results than those reported in the literature, which were obtained using a method that requires the number of clusters to be defined in advance. In the gene-based clustering of the considered data, our approach generates clusters that are easily divisible into subclusters related to particular sample classes. This corresponds, in a way, to subspace clustering, which is highly desirable in microarray data analysis.