Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays.

Oligonucleotide arrays can provide a broad picture of the state of the cell, by monitoring the expression level of thousands of genes at the same time. It is of interest to develop techniques for extracting useful information from the resulting data sets. Here we report the application of a two-way clustering method for analyzing a data set consisting of the expression patterns of different cell types. Gene expression in 40 tumor and 22 normal colon tissue samples was analyzed with an Affymetrix oligonucleotide array complementary to more than 6,500 human genes. An efficient two-way clustering algorithm was applied to both the genes and the tissues, revealing broad coherent patterns that suggest a high degree of organization underlying gene expression in these tissues. Coregulated families of genes clustered together, as demonstrated for the ribosomal proteins. Clustering also separated cancerous from noncancerous tissue and cell lines from in vivo tissues on the basis of subtle distributed patterns of genes even when expression of individual genes varied only slightly between the tissues. Two-way clustering thus may be of use both in classifying genes into functional groups and in classifying tissues based on gene expression.
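The idea of clustering both genes and tissues can be illustrated with a small sketch. The abstract does not specify the algorithm, so the code below substitutes plain average-linkage agglomerative clustering with a correlation-based distance; it is not the authors' method. All names (`pearson`, `cluster_order`, `two_way_cluster`) and the toy 4 x 4 matrix are hypothetical and chosen only for illustration.

```python
import math

def pearson(u, v):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    du = [x - mean_u for x in u]
    dv = [y - mean_v for y in v]
    num = sum(a * b for a, b in zip(du, dv))
    den = math.sqrt(sum(a * a for a in du) * sum(b * b for b in dv))
    return num / den if den else 0.0

def cluster_order(vectors):
    """Average-linkage agglomerative clustering (a stand-in technique).

    Returns the indices of `vectors` in the leaf order of the merge tree,
    so that similar profiles end up next to each other.
    """
    clusters = [[i] for i in range(len(vectors))]

    def dist(a, b):
        # Mean of (1 - correlation) over all cross-cluster pairs.
        total = sum(1 - pearson(vectors[i], vectors[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        # Find the closest pair of clusters and merge them.
        _, p, q = min(
            (dist(clusters[p], clusters[q]), p, q)
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        )
        merged = clusters[p] + clusters[q]
        clusters = [c for k, c in enumerate(clusters) if k not in (p, q)]
        clusters.append(merged)
    return clusters[0]

def two_way_cluster(matrix):
    """Cluster rows (genes) and columns (tissues) independently,
    then reorder the matrix along both axes."""
    row_order = cluster_order(matrix)
    columns = [list(col) for col in zip(*matrix)]
    col_order = cluster_order(columns)
    reordered = [[matrix[r][c] for c in col_order] for r in row_order]
    return row_order, col_order, reordered

# Toy data (invented): rows are genes, columns are tissue samples.
# Genes 0 and 2 are high in tissues 0 and 2; genes 1 and 3 in tissues 1 and 3.
matrix = [
    [9, 1, 8, 2],
    [1, 9, 2, 8],
    [8, 2, 9, 1],
    [2, 8, 1, 9],
]
row_order, col_order, reordered = two_way_cluster(matrix)
```

In the reordered matrix, co-varying genes sit in adjacent rows and similar tissues in adjacent columns, which is what makes block-like patterns visible when the matrix is displayed. The pairwise search here is cubic in the number of profiles, so a sketch like this would suit only small examples, not thousands of genes.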
