Rough sets and genetic algorithms: A hybrid approach to breast cancer classification

Computational intelligence techniques such as rough sets, neural networks, fuzzy sets, and genetic algorithms are widely used for prediction and classification. This paper presents a generic classification model based on a rough set approach and decision rules. To increase the efficiency of the classification process, a Boolean reasoning discretization algorithm is used to discretize the data sets. The approach is tested in a comparative study of three classifiers (decision rules, naive Bayes, and k-nearest neighbor) over three discretization techniques (equal binning, entropy, and Boolean reasoning). The rough set reduction technique is applied to find all the reducts of the data, each of which is a minimal subset of attributes associated with a class label for prediction. In this paper we adopt a genetic algorithm approach to find the reducts. Finally, decision rules are used as a classifier to evaluate the performance of the predicted reducts and classes. To evaluate the performance of our approach, we present tests on a breast cancer data set. The experimental results show that the overall classification accuracy offered by the employed rough set approach and decision rules is high compared with other classification techniques, including naive Bayes and k-nearest neighbor.
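To make the notion of a reduct concrete, the following is a minimal sketch rather than the method used in the paper. It assumes a tiny, already discretized decision table with hypothetical attribute names (`clump`, `size`, `shape`) and invented values, and it replaces the genetic algorithm search with an exhaustive search, which is only feasible for a handful of attributes. A subset of attributes counts as a reduct here when it preserves the rough set dependency degree of the full attribute set and no smaller subset already found does the same.

```python
from itertools import combinations

# Hypothetical, already discretized toy decision table (not data from the paper).
TABLE = [
    {"clump": "high", "size": "high", "shape": "high", "label": "malignant"},
    {"clump": "high", "size": "high", "shape": "low",  "label": "benign"},
    {"clump": "low",  "size": "low",  "shape": "high", "label": "benign"},
    {"clump": "low",  "size": "high", "shape": "high", "label": "malignant"},
    {"clump": "high", "size": "low",  "shape": "low",  "label": "benign"},
]


def dependency(rows, attrs, decision):
    """Fraction of rows in the positive region: rows whose equivalence
    class under `attrs` maps to exactly one decision value."""
    classes = {}
    for row in rows:
        key = tuple(row[a] for a in attrs)
        classes.setdefault(key, []).append(row[decision])
    consistent = sum(len(ds) for ds in classes.values() if len(set(ds)) == 1)
    return consistent / len(rows)


def reducts(rows, condition_attrs, decision):
    """Exhaustive search (assumption: stands in for the genetic search)
    for minimal attribute subsets that keep the full dependency degree."""
    full = dependency(rows, condition_attrs, decision)
    found = []
    for size in range(1, len(condition_attrs) + 1):
        for subset in combinations(condition_attrs, size):
            # Skip supersets of an already found reduct: they are not minimal.
            if any(set(r) <= set(subset) for r in found):
                continue
            if dependency(rows, subset, decision) == full:
                found.append(subset)
    return found


if __name__ == "__main__":
    print(reducts(TABLE, ["clump", "size", "shape"], "label"))
```

On this toy table the only reduct is `("size", "shape")`: neither attribute separates the classes alone, but together they determine the label for every row. Because the number of candidate subsets grows exponentially with the number of attributes, a heuristic search such as a genetic algorithm becomes attractive on real data.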
