Unsupervised gene selection using particle swarm optimization and k-means

Microarray experiments generate large-scale data in the form of gene expression values. An unsupervised feature selection approach for sample-based clustering of gene expression data is proposed. The approach uses Particle Swarm Optimization (PSO) to generate candidate gene subsets and k-means as the wrapper algorithm to evaluate them. Clustering accuracies of 70-80% were obtained across the different datasets.
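To make the wrapper idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a binary PSO variant (sigmoid transfer on clamped velocities) in which each particle is a 0/1 mask over genes, and it scores a mask by running k-means on the selected genes and taking the ratio of between-cluster to within-cluster scatter. The abstract does not state the fitness criterion, the PSO variant, or any parameter values, so the function names (`kmeans`, `fitness`, `binary_pso`, `make_toy_data`), the scatter-ratio score, the parameters `w`, `c1`, `c2`, and the synthetic data are all assumptions chosen for illustration.

```python
import math
import random


def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's k-means; returns a cluster label for every sample."""
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [min(range(k),
                      key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
                  for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def fitness(data, mask, k, seed=0):
    """Assumed criterion: between/within scatter of k-means clusters on the masked genes."""
    genes = [j for j, bit in enumerate(mask) if bit]
    if not genes:
        return 0.0  # an empty subset cannot be clustered
    points = [[row[j] for j in genes] for row in data]
    labels = kmeans(points, k, random.Random(seed))  # fixed seed keeps scores repeatable
    overall = [sum(col) / len(points) for col in zip(*points)]
    within = between = 0.0
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        if not members:
            continue
        center = [sum(col) / len(members) for col in zip(*members)]
        within += sum((a - b) ** 2 for p in members for a, b in zip(p, center))
        between += len(members) * sum((a - b) ** 2 for a, b in zip(center, overall))
    return between / (within + 1e-9)


def binary_pso(data, k, n_particles=10, n_iters=20, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Binary PSO over gene masks; returns (best mask, best fitness)."""
    rng = random.Random(seed)
    n_genes = len(data[0])
    pos = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(n_particles)]
    vel = [[0.0] * n_genes for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(data, p, k) for p in pos]
    g = max(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for j in range(n_genes):
                # Velocity update: inertia plus pulls toward personal and global bests.
                v = (w * vel[i][j]
                     + c1 * rng.random() * (pbest[i][j] - pos[i][j])
                     + c2 * rng.random() * (gbest[j] - pos[i][j]))
                vel[i][j] = max(-4.0, min(4.0, v))  # clamp so the sigmoid stays responsive
                # Sigmoid of the velocity is the probability that gene j is selected.
                pos[i][j] = 1 if rng.random() < 1 / (1 + math.exp(-vel[i][j])) else 0
            f = fitness(data, pos[i], k)
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit


def make_toy_data(seed=1):
    """Hypothetical data: 20 samples x 6 genes; only genes 0 and 1 separate the two groups."""
    rng = random.Random(seed)
    data = []
    for i in range(20):
        shift = 0.0 if i < 10 else 5.0
        informative = [shift + rng.gauss(0, 0.5) for _ in range(2)]
        noise = [rng.gauss(0, 2.0) for _ in range(4)]
        data.append(informative + noise)
    return data


if __name__ == "__main__":
    mask, score = binary_pso(make_toy_data(), k=2)
    print("selected gene mask:", mask, "score:", round(score, 3))
```

The scatter ratio is only a stand-in for whatever subset-quality measure the paper uses, and it tends to favor small subsets, so a real experiment would likely need a different criterion or a penalty on subset size. Clustering accuracy, as reported in the abstract, would be computed afterwards by comparing the k-means labels on the selected genes with known sample classes, which this sketch does not do.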
