Plaid models for gene expression data

Motivated by genetic expression data, we introduce plaid models. These are a form of two-sided cluster analysis that allows clusters to overlap. Plaid models also incorporate additive two-way ANOVA models within the two-sided clusters. Using these models, we find interpretable structure in some yeast expression data, as well as in some nutrition data and some foreign exchange data.
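
To make the idea of overlapping two-sided clusters with additive two-way effects concrete, the following is a minimal sketch, not the authors' fitting procedure. It assumes a data matrix is built as a background level plus a sum of layers, where each layer covers a subset of rows and a subset of columns and contributes a layer mean plus a row effect plus a column effect inside that block. The function name `plaid_layer`, the matrix size, and all numeric values are hypothetical and chosen only for illustration.

```python
import numpy as np

def plaid_layer(n_rows, n_cols, rows, cols, mu, alpha, beta):
    """Build one illustrative layer (hypothetical helper, not from the paper).

    Inside the block defined by `rows` x `cols` the value is
    mu + alpha[i] + beta[j] (an additive two-way ANOVA pattern);
    outside the block the layer contributes zero.
    """
    layer = np.zeros((n_rows, n_cols))
    block = (mu
             + np.asarray(alpha, dtype=float)[:, None]
             + np.asarray(beta, dtype=float)[None, :])
    layer[np.ix_(rows, cols)] = block
    return layer

# Assumed toy setup: 6 rows (e.g. genes) by 5 columns (e.g. samples).
background = 1.0

# Layer 1 covers rows 0-2 and columns 0-1, with row and column effects.
layer1 = plaid_layer(6, 5, rows=[0, 1, 2], cols=[0, 1],
                     mu=2.0, alpha=[0.5, 0.0, -0.5], beta=[0.25, -0.25])

# Layer 2 covers rows 2-3 and columns 1-3; it overlaps layer 1 at cell (2, 1).
layer2 = plaid_layer(6, 5, rows=[2, 3], cols=[1, 2, 3],
                     mu=-1.0, alpha=[0.0, 0.0], beta=[0.0, 0.0, 0.0])

# Overlapping layers simply add where they coincide.
Y = background + layer1 + layer2
print(Y)
```

In this sketch the cell at row 2, column 1 belongs to both layers, so its value is the background plus both layer contributions. Fitting such layers to real data requires searching over row and column memberships, which this sketch does not attempt.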
