A Novel Feature Selection Method for Effective Breast Cancer Diagnosis and Prognosis

Medical diagnosis is a major area of current research in data mining. The present study proposes a feature selection algorithm, Modified Correlation Rough Set Feature Selection (MCRSFS), and evaluates it on the Wisconsin breast cancer data sets for both diagnosis and prognosis using several data mining classification algorithms. The proposed approach selects features in two levels. In level 1, features are selected using rough sets, with different starting values of the reduct. In level 2, features are selected from the reduced set using Correlation Feature Selection (CFS). Experiments show that the proposed method compares favorably with other methods in terms of the number of selected features and classification performance.

General Terms: Pattern Recognition, Machine Learning.
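The two-level selection described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' MCRSFS implementation: level 1 is approximated by a greedy QuickReduct-style search over the rough-set dependency degree, seeded with an optional `start` tuple of attributes, and level 2 by a greedy forward search on the standard CFS merit, with absolute Pearson correlation swapped in as the correlation measure. The function names, the `start` parameter, and the toy data are all hypothetical.

```python
from itertools import combinations
from math import sqrt

def dependency(rows, attrs, labels):
    """Rough-set dependency degree: share of rows whose equivalence class
    under `attrs` contains a single decision label (the positive region)."""
    classes = {}
    for row, y in zip(rows, labels):
        classes.setdefault(tuple(row[a] for a in attrs), []).append(y)
    positive = sum(len(ys) for ys in classes.values() if len(set(ys)) == 1)
    return positive / len(rows)

def quick_reduct(rows, labels, start=()):
    """Level 1 (assumed): grow a reduct greedily from `start` until it has the
    same dependency degree as the full attribute set."""
    all_attrs = list(range(len(rows[0])))
    target = dependency(rows, all_attrs, labels)
    reduct = list(start)
    while dependency(rows, reduct, labels) < target:
        candidates = [a for a in all_attrs if a not in reduct]
        reduct.append(max(candidates,
                          key=lambda a: dependency(rows, reduct + [a], labels)))
    return reduct

def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return 0.0 if vx == 0 or vy == 0 else cov / sqrt(vx * vy)

def cfs_merit(rows, labels, subset):
    """CFS merit: k * r_cf / sqrt(k + k * (k - 1) * r_ff)."""
    k = len(subset)
    cols = {a: [r[a] for r in rows] for a in subset}
    r_cf = sum(abs(pearson(cols[a], labels)) for a in subset) / k
    pairs = list(combinations(subset, 2))
    r_ff = (sum(abs(pearson(cols[a], cols[b])) for a, b in pairs) / len(pairs)
            if pairs else 0.0)
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)

def cfs_select(rows, labels, candidates):
    """Level 2 (assumed): greedy forward selection that stops as soon as
    adding a feature no longer improves the merit."""
    selected, best, remaining = [], 0.0, list(candidates)
    while remaining:
        score, attr = max((cfs_merit(rows, labels, selected + [a]), a)
                          for a in remaining)
        if score <= best:
            break
        selected.append(attr)
        remaining.remove(attr)
        best = score
    return selected

def two_level_select(rows, labels, start=()):
    """Run the rough-set level, then CFS on the surviving features."""
    return cfs_select(rows, labels, quick_reduct(rows, labels, start))

# Toy, already discretized data (hypothetical): column 0 matches the label,
# column 1 is weakly related, column 2 duplicates column 0, column 3 is constant.
ROWS = [[1, 0, 1, 5], [1, 1, 1, 5], [0, 0, 0, 5],
        [0, 1, 0, 5], [1, 0, 1, 5], [0, 1, 0, 5]]
LABELS = [1, 1, 0, 0, 1, 0]

print(two_level_select(ROWS, LABELS))              # [0]
print(two_level_select(ROWS, LABELS, start=(1,)))  # [0]
```

On this toy data, seeding level 1 with attribute 1 yields the larger reduct `[1, 0]`, and the CFS level then discards attribute 1 because it lowers the merit. Real data would first need discretization for the rough-set step, and a different correlation measure may be more appropriate for categorical features.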
