Computational intelligence techniques for acute leukemia gene expression data classification

Recent advances in microarray technologies allow scientists to discover and monitor the mRNA transcript levels of thousands of genes in a single experiment. The data obtained from microarray studies are very high-dimensional, which makes them challenging to analyze. In this paper, we design an expression-based classification method for acute leukemia. We consider several dimension reduction techniques to tackle the very high dimensionality of this kind of data, and the classification system then applies artificial neural networks to the reduced data. The comparative results indicate that high classification rates are possible and, moreover, that subsets of features contributing significantly to the success of the neural classifiers can be identified.
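The two-stage design described above (dimension reduction, then a neural classifier) can be illustrated with a minimal sketch. The abstract does not name the specific reduction techniques or network architecture, so everything concrete here is an assumption made for illustration: PCA via SVD stands in for the reduction step, a one-hidden-layer network trained by plain gradient descent stands in for the classifier, and the data are synthetic (two classes that differ in 20 of 500 simulated "genes"), not the leukemia data set. The function names `pca_reduce`, `train_mlp`, `predict`, and `run_demo` are hypothetical.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project rows of X onto the top principal directions (PCA via SVD)."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    return Xc @ components.T, mean, components

def train_mlp(Z, y, hidden=8, epochs=500, lr=0.1, seed=0):
    """Fit a one-hidden-layer tanh network with full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.1, size=(Z.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(scale=0.1, size=hidden)
    b2 = 0.0
    for _ in range(epochs):
        H = np.tanh(Z @ W1 + b1)                    # hidden activations
        p = 1.0 / (1.0 + np.exp(-(H @ W2 + b2)))    # sigmoid output
        g = (p - y) / len(y)                        # d(mean cross-entropy)/d(logit)
        gH = np.outer(g, W2) * (1.0 - H ** 2)       # backprop through tanh
        W2 -= lr * (H.T @ g)
        b2 -= lr * g.sum()
        W1 -= lr * (Z.T @ gH)
        b1 -= lr * gH.sum(axis=0)
    return W1, b1, W2, b2

def predict(Z, params):
    """Return 0/1 class labels for the rows of Z."""
    W1, b1, W2, b2 = params
    H = np.tanh(Z @ W1 + b1)
    return (H @ W2 + b2 > 0).astype(int)

def run_demo(seed=0):
    """Synthetic stand-in for expression data: 40 samples, 500 features."""
    rng = np.random.default_rng(seed)
    n_samples, n_genes = 40, 500
    y = np.arange(n_samples) % 2                    # alternating class labels
    X = rng.normal(size=(n_samples, n_genes))
    X[:, :20] += 3.0 * y[:, None]                   # 20 informative features (assumed)

    X_train, y_train = X[:30], y[:30]
    X_test, y_test = X[30:], y[30:]

    # Fit the reduction on the training samples only, then reuse it for the test set.
    Z_train, mean, components = pca_reduce(X_train, n_components=5)
    scale = Z_train.std(axis=0)
    Z_test = (X_test - mean) @ components.T

    params = train_mlp(Z_train / scale, y_train)
    return float((predict(Z_test / scale, params) == y_test).mean())

if __name__ == "__main__":
    print("held-out accuracy:", run_demo())
```

Fitting the projection on the training samples alone and reusing it for the held-out samples keeps the accuracy estimate from being inflated by information leaking from the test set, which matters when the number of features far exceeds the number of samples.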
