Rough set based gene selection algorithm for microarray sample classification

Selecting genes from microarray data is an important problem, both for gene-expression-based classification and for developing diagnostic tests. To address it, a rough set based gene selection algorithm is presented. It selects a set of genes by maximizing their relevance and significance, both of which are computed using the theory of rough sets. The performance of the proposed algorithm is studied on five cancer and two arthritis microarray data sets and compared with that of other related methods, using the predictive accuracy of the K-nearest neighbor rule and the support vector machine. The proposed algorithm achieves promising performance, selecting relevant and significant genes from microarray data sets in a reasonable time.
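The abstract does not spell out how relevance and significance are computed, so the following is only a minimal sketch of one common rough-set formulation, not the authors' implementation. It assumes expression values have already been discretized, takes the relevance of a gene to be the rough-set dependency of the class label on that gene (the fraction of samples in the positive region), and takes the significance of a candidate gene with respect to a selected gene to be the gain in dependency when the two are used together. The function names `dependency` and `select_genes`, the greedy scoring rule (relevance plus average significance), the rule that skips candidates with zero significance, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def dependency(samples, labels, genes):
    """Rough-set dependency of the class label on `genes`: the fraction of
    samples whose equivalence class (identical values on `genes`) contains
    a single class label, i.e. the relative size of the positive region."""
    blocks = defaultdict(list)
    for row, label in zip(samples, labels):
        blocks[tuple(row[g] for g in genes)].append(label)
    positive = sum(len(b) for b in blocks.values() if len(set(b)) == 1)
    return positive / len(samples)


def select_genes(samples, labels, k, eps=1e-12):
    """Greedy selection of up to k genes (illustrative scoring rule).

    Start with the most relevant gene, then repeatedly add the candidate
    that maximizes relevance + average significance with respect to the
    genes already selected. Candidates that add no dependency beyond some
    selected gene (significance <= eps) are skipped as redundant.
    """
    n_genes = len(samples[0])
    relevance = [dependency(samples, labels, [g]) for g in range(n_genes)]
    selected = [max(range(n_genes), key=lambda g: relevance[g])]
    remaining = set(range(n_genes)) - set(selected)
    while remaining and len(selected) < k:
        scores = {}
        for g in sorted(remaining):
            # Significance of g w.r.t. each selected gene s:
            # dependency on {s, g} minus dependency on {s} alone.
            sig = [dependency(samples, labels, [s, g]) - relevance[s]
                   for s in selected]
            if min(sig) <= eps:
                continue  # redundant with an already selected gene
            scores[g] = relevance[g] + sum(sig) / len(sig)
        if not scores:
            break  # no remaining gene adds information
        best = max(scores, key=scores.get)
        selected.append(best)
        remaining.discard(best)
    return selected


# Toy data (hypothetical): 6 samples x 4 discretized genes, two classes.
# Gene 1 duplicates gene 0; gene 3 is useless alone but complements gene 0.
toy_samples = [
    [0, 0, 0, 0],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [2, 2, 1, 0],
    [2, 2, 1, 1],
]
toy_labels = [0, 0, 0, 1, 1, 1]

print(select_genes(toy_samples, toy_labels, k=3))  # prints [0, 3]
```

On the toy data, gene 1 has the same relevance as gene 0 but is skipped because it adds nothing once gene 0 is selected, whereas gene 3 has zero relevance on its own yet is chosen because together with gene 0 it separates the two classes completely. The selected genes could then be passed to a K-nearest neighbor or support vector machine classifier to estimate predictive accuracy, as the abstract describes.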
