Fuzzy-rough attribute reduction via mutual information with an application to cancer classification

Building a classification model for cancer recognition from DNA microarray data is useful for cancer diagnosis. Feature selection is a key step in microarray-based cancer classification, because the number of genes available as predictors is large while the number of samples is relatively small. Automatic methods are therefore needed to extract the relevant genes that are essential for classification. This paper proposes a novel approach to reducing data redundancy based on fuzzy rough set theory and information theory, and presents a mutual-information-based algorithm for attribute reduction. The method is applied to gene selection for cancer classification. Experimental results show that the algorithm is more effective than conventional rough-set-based approaches.
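The abstract does not spell out the algorithm, so the following is only a minimal sketch of what a greedy, mutual-information-driven attribute reduction over fuzzy relations might look like. Everything in it is an assumption rather than the paper's method: the similarity measure `1 - |a - b|` on attributes scaled to [0, 1], the min operator for combining relations, the entropy form `H(R) = -(1/n) * sum(log2(|[x_i]_R| / n))`, the forward-selection strategy, and all function names.

```python
import math

def relation(columns, n):
    """Fuzzy relation induced by a set of attributes (hypothetical helper).

    Assumes each column is scaled to [0, 1]; per-attribute similarity is
    1 - |a - b|, combined across attributes with min. An empty attribute
    set yields the all-ones relation.
    """
    return [[min((1.0 - abs(c[i] - c[j]) for c in columns), default=1.0)
             for j in range(n)] for i in range(n)]

def decision_relation(labels):
    """Crisp equivalence relation of the class labels."""
    return [[1.0 if a == b else 0.0 for b in labels] for a in labels]

def meet(r1, r2):
    """Element-wise min of two relations (relation of the combined attributes)."""
    return [[min(a, b) for a, b in zip(x, y)] for x, y in zip(r1, r2)]

def entropy(rel):
    """Assumed fuzzy entropy: -(1/n) * sum_i log2(|[x_i]| / n), |[x_i]| = row sum."""
    n = len(rel)
    return -sum(math.log2(sum(row) / n) for row in rel) / n

def mutual_information(r_b, r_d):
    """I(B; D) = H(B) + H(D) - H(B and D) under the entropy assumed above."""
    return entropy(r_b) + entropy(r_d) - entropy(meet(r_b, r_d))

def greedy_reduct(columns, labels, tol=1e-9):
    """Forward selection: add the attribute that most increases I(selected; D),
    stopping once the mutual information of the full attribute set is reached
    or no candidate improves it."""
    n = len(labels)
    r_d = decision_relation(labels)
    target = mutual_information(relation(columns, n), r_d)
    selected, r_sel, best = [], relation([], n), 0.0
    while best < target - tol and len(selected) < len(columns):
        scored = [(mutual_information(meet(r_sel, relation([columns[k]], n)), r_d), k)
                  for k in range(len(columns)) if k not in selected]
        mi, k = max(scored, key=lambda t: t[0])
        if mi <= best + tol:
            break  # no remaining attribute adds information
        selected.append(k)
        r_sel = meet(r_sel, relation([columns[k]], n))
        best = mi
    return selected

# Toy data (made up for illustration): attribute 0 separates the two classes,
# attribute 1 is constant, attribute 2 is unrelated to the labels.
toy_columns = [[0.0, 0.0, 1.0, 1.0], [0.5, 0.5, 0.5, 0.5], [0.0, 1.0, 0.0, 1.0]]
toy_labels = [0, 0, 1, 1]
toy_reduct = greedy_reduct(toy_columns, toy_labels)
```

In a gene-selection setting, each column would hold one gene's normalized expression values across samples, and the returned indices would be the selected genes. With thousands of genes, the pairwise relation matrices make this sketch quadratic in the number of samples per candidate, so a practical version would likely need caching or pre-filtering.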
