gene‐CBR: A CASE‐BASED REASONING TOOL FOR CANCER DIAGNOSIS USING MICROARRAY DATA SETS

Gene expression profiles comprise the expression levels of thousands of genes measured simultaneously, reflecting the complex relationships between them. A well‐known constraint specific to microarray data is the large number of genes relative to the small number of available experiments or cases. In this context, the ability to design methods that overcome the limitations of current state‐of‐the‐art algorithms is crucial to the development of successful applications. This paper presents gene‐CBR, a hybrid model that performs cancer classification based on microarray data. The system employs a case‐based reasoning model that incorporates a set of fuzzy prototypes, a growing cell structure network and a set of rules to provide an accurate diagnosis. The hybrid model has been implemented and tested with microarray data from bone marrow cases of forty‐three adult patients with cancer, plus a group of six cases from healthy persons.
