Clustering Genomic Expression Data: Design and Evaluation Principles

This chapter has introduced key aspects of clustering systems for genomic expression data. It presented an overview of the major types of clustering approaches, their problems and design criteria, and addressed the evaluation of clustering results and the prediction of optimal partitions. This evaluation problem, which has not traditionally received adequate attention from the expression research community, is crucial for the implementation of advanced clustering-based studies. A cluster evaluation framework may have a major impact on the generation of relevant and valid results, and the chapter shows how it may also support or guide biomedical knowledge discovery tasks. The clustering and validation techniques presented in this chapter may be applied to expression data of higher sample and feature set dimensionality.
