Clustering Techniques from Significance Analysis of Microarrays

Microarray technology is a prominent tool for measuring the expression of many thousands of genes in a single experiment and for identifying the primary genetic causes of various human diseases. The technology has many applications, but its datasets are high-dimensional, which makes it difficult to analyze the complete gene set. In this paper, the Significance Analysis of Microarrays (SAM) technique is applied to the Golub microarray dataset to identify significant genes. The identified genes are then clustered using three techniques: hierarchical, k-means, and fuzzy c-means clustering. Clustering forms groups that share similar characteristics, which is useful when an unknown dataset is analyzed. The results show that hierarchical clustering performs well, placing exactly 27 samples in the first cluster (ALL) and 11 samples in the second cluster (AML). These clusters give an indication of the characteristics of the dataset.
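The two-stage pipeline described above (rank genes with a SAM-style statistic, then cluster on the selected genes) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic rather than the Golub dataset, the 27/11 group sizes are borrowed only to mirror the abstract, the fudge factor s0 is set to the median gene-wise standard error instead of being tuned as in SAM, the permutation-based significance step is omitted, and the clustering groups samples by average-linkage agglomeration. All function and variable names are hypothetical.

```python
import math
import random
from statistics import mean, median


def sam_scores(rows, labels, s0=None):
    """SAM-style relative difference d(i) for each gene (one row per gene)."""
    g1 = [j for j, c in enumerate(labels) if c == 0]
    g2 = [j for j, c in enumerate(labels) if c == 1]
    n1, n2 = len(g1), len(g2)
    diffs, sds = [], []
    for row in rows:
        a = [row[j] for j in g1]
        b = [row[j] for j in g2]
        ma, mb = mean(a), mean(b)
        ss = sum((x - ma) ** 2 for x in a) + sum((x - mb) ** 2 for x in b)
        # Gene-specific scatter: pooled standard error of the mean difference.
        sds.append(math.sqrt((1 / n1 + 1 / n2) * ss / (n1 + n2 - 2)))
        diffs.append(ma - mb)
    if s0 is None:
        s0 = median(sds)  # assumption: simple stand-in for SAM's tuned s0
    return [d / (s + s0) for d, s in zip(diffs, sds)]


def average_linkage(points, k):
    """Naive agglomerative clustering; returns k lists of point indices."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = mean(math.dist(points[i], points[j])
                         for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Synthetic stand-in data: 40 genes x 38 samples, first 8 genes informative.
random.seed(0)
labels = [0] * 27 + [1] * 11
rows = [[random.gauss(0, 1) + (4.0 if g < 8 and labels[j] == 0 else 0.0)
         for j in range(len(labels))]
        for g in range(40)]

d = sam_scores(rows, labels)
# Keep the 8 genes with the largest |d(i)| (a fixed cutoff, for illustration).
selected = sorted(range(len(rows)), key=lambda g: abs(d[g]), reverse=True)[:8]

# Describe each sample by the selected genes only, then form two clusters.
points = [[rows[g][j] for g in selected] for j in range(len(labels))]
clusters = average_linkage(points, 2)
print("selected genes:", sorted(selected))
print("cluster sizes:", sorted(len(c) for c in clusters))
```

On real data, the fixed top-8 cutoff would be replaced by SAM's permutation-based threshold, and k-means or fuzzy c-means could be substituted for the agglomerative step to compare the three techniques.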
