An effective graph-based clustering technique to identify coherent patterns from gene expression data

This paper presents GCEPD, an effective, parameter-less, graph-based clustering technique. GCEPD produces highly coherent clusters as judged by several cluster validity measures, and the patterns it finds contain genes with high biological relevance. Experiments with real-life datasets establish that the method produces clusters that are significantly better, in terms of various quality measures, than those produced by other similar algorithms.
