Clustering methods for microarray gene expression data.

In genomics, microarray technologies have become a powerful tool for simultaneously monitoring the expression of thousands of genes under different sets of conditions. A central task is now to develop analytical methods that identify groups of genes that show similar expression patterns and are activated by similar conditions. The corresponding analysis problem is the clustering of multi-condition gene expression data. This paper presents an overview of the clustering techniques used in microarray gene expression data analysis.
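
To make the problem statement concrete, the following is a minimal sketch of one possible formulation, not a method taken from the paper: each gene is represented by its expression profile across conditions, profiles are compared with a Pearson correlation distance, and genes are grouped by average-linkage agglomerative clustering. The function names, the gene labels, and the toy expression values are all hypothetical and chosen only for illustration.

```python
from math import sqrt


def pearson_distance(x, y):
    """Return 1 - Pearson correlation between two expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 1.0  # assumption: a flat profile is treated as uncorrelated
    return 1.0 - cov / sqrt(var_x * var_y)


def average_linkage(profiles, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    clusters = [[name] for name in profiles]

    def cluster_distance(a, b):
        # Average pairwise distance between members of the two clusters.
        total = sum(
            pearson_distance(profiles[g], profiles[h]) for g in a for h in b
        )
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = cluster_distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical toy data: four genes measured under four conditions.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 3.9, 6.2, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [8.2, 6.1, 3.9, 2.0],
}

groups = average_linkage(expression, n_clusters=2)
print(groups)
```

In this toy example, geneA and geneB rise together across the conditions while geneC and geneD fall together, so the two pairs end up in separate groups. A correlation-based distance is used here because it compares the shape of the profiles rather than their absolute levels; other distances and clustering schemes are equally plausible choices.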
