Unsupervised gene selection via spectral biclustering

Selecting significant genes from expression patterns is an important problem in microarray data processing. In this article, we propose and study a new unsupervised method that selects relevant genes using spectral biclustering, choosing genes according to their similarity to the resulting eigenvectors. The proposed algorithm can select a much smaller gene subset that still yields accurate predictions. We demonstrate the method on two microarray cancer data sets: the lymphoma and the liver cancer data sets. In both examples, our method identifies two-gene combinations that lead to predictions with very high accuracy.
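
The abstract does not spell out the normalization or the similarity measure, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a genes-by-samples matrix, uses simple double centering in place of the normalization used in spectral biclustering, takes the leading right singular vectors as the "eigenvectors", and scores each gene by absolute cosine similarity. The function name `rank_genes_by_similarity` and its parameters are hypothetical.

```python
import numpy as np

def rank_genes_by_similarity(X, n_vectors=2, top_k=2):
    """Rank genes by |cosine similarity| to leading singular vectors.

    X is a (n_genes, n_samples) expression matrix. All choices here
    (double centering, cosine similarity, max over vectors) are
    illustrative assumptions, not taken from the paper.
    """
    X = np.asarray(X, dtype=float)

    # Double centering: remove row means, column means, and add back the
    # grand mean (a stand-in for the biclustering normalization).
    Xc = (X
          - X.mean(axis=1, keepdims=True)
          - X.mean(axis=0, keepdims=True)
          + X.mean())

    # Rows of vt are unit-norm right singular vectors over the samples.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    V = vt[:n_vectors]

    # Normalize each gene profile; guard against all-zero rows.
    norms = np.linalg.norm(Xc, axis=1, keepdims=True)
    norms[norms == 0] = 1.0

    # Absolute cosine similarity of every gene with every chosen vector.
    cos = np.abs((Xc / norms) @ V.T)

    # Score a gene by its best match among the chosen vectors.
    scores = cos.max(axis=1)
    top = np.argsort(-scores)[:top_k]
    return top, scores
```

Under these assumptions, a gene whose profile closely follows a sample partition captured by a leading singular vector receives a score near 1, and `top_k=2` returns a candidate two-gene combination that could then be evaluated with a separate classifier.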
