The CRASSS plug-in for integrating annotation data with hierarchical clustering results

We describe an algorithm for finding the most statistically significant non-overlapping subtrees of a hierarchical clustering of gene expression data with respect to a set of secondary data labels on genes. The method is implemented as a Java plug-in for a commercial gene expression analysis program (GeneSpring).
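The general idea can be illustrated with a minimal sketch. The abstract does not specify the test statistic or the selection rule, so everything below is an assumption made for illustration: the tree is a nested pair of tuples with gene names at the leaves, each subtree is scored with a one-sided hypergeometric enrichment test for a single binary label, and non-overlapping subtrees are picked greedily in order of increasing p-value. The function names (`hypergeom_tail`, `score_subtrees`, `select_non_overlapping`) are hypothetical and are not part of the plug-in or of GeneSpring. Multiple-testing correction is omitted.

```python
from math import comb


def hypergeom_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn from N, of which K carry the label."""
    upper = min(n, K)
    total = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return total / comb(N, n)


def leaves(node):
    """Return the gene names under a node (a leaf is a plain string)."""
    if isinstance(node, str):
        return [node]
    left, right = node
    return leaves(left) + leaves(right)


def score_subtrees(tree, labeled):
    """Score every subtree (including single leaves) for label enrichment."""
    all_genes = leaves(tree)
    N = len(all_genes)
    K = sum(g in labeled for g in all_genes)
    scored = []

    def visit(node):
        genes = leaves(node)
        if not isinstance(node, str):
            for child in node:
                visit(child)
        k = sum(g in labeled for g in genes)
        scored.append((hypergeom_tail(k, len(genes), K, N), frozenset(genes)))

    visit(tree)
    return scored


def select_non_overlapping(scored, alpha=0.05):
    """Greedily keep the most significant subtrees that share no genes."""
    chosen = []
    # Sort by p-value; break ties in favour of the larger subtree.
    for p, genes in sorted(scored, key=lambda t: (t[0], -len(t[1]))):
        if p > alpha:
            break
        if all(genes.isdisjoint(kept) for _, kept in chosen):
            chosen.append((p, genes))
    return chosen
```

Because two subtrees of a binary tree are either nested or disjoint, checking that the gene sets are disjoint is enough to guarantee that no selected subtree contains another. A greedy pass is only one possible selection rule; the published method may resolve ancestor and descendant conflicts differently.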
