A method to select a good setting for the kNN algorithm when using it for breast cancer prognosis

Breast cancer is the world's second most frequent type of cancer, and the third most frequent in Japan. Predicting its recurrence after a first treatment is very important for increasing a patient's survival rate. This work applies the k-Nearest Neighbors (kNN) method to breast cancer prognosis and proposes a method for selecting a good setting of the parameters that can be changed when using this classification method. Using our method with the Wisconsin breast cancer prognosis data, the kNN method achieves an average accuracy of 76%, with a small standard deviation and a small difference between its maximum and minimum accuracy values.
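The general idea of choosing a kNN setting by comparing average accuracy, standard deviation, and the gap between maximum and minimum accuracy can be sketched as follows. This is only an illustrative sketch, not the procedure used in this work: the abstract does not state which parameters are tuned or how candidates are ranked, so the choice of `k` and the distance function as the tunable parameters, the repeated random train/test splits, the ranking rule (highest mean, then smallest standard deviation), and the synthetic two-class data from the hypothetical `make_toy_data` helper are all assumptions.

```python
import math
import random
import statistics
from collections import Counter
from itertools import product


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_X, train_y, query, k, dist):
    # Sort training points by distance to the query and vote among the k closest.
    neighbors = sorted(zip(train_X, train_y), key=lambda p: dist(p[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def evaluate_setting(X, y, k, dist, runs=10, test_fraction=0.3, seed=0):
    """Accuracy statistics of one (k, distance) setting over random splits."""
    rng = random.Random(seed)
    indices = list(range(len(X)))
    n_test = max(1, int(len(X) * test_fraction))
    accuracies = []
    for _ in range(runs):
        rng.shuffle(indices)
        test_idx, train_idx = indices[:n_test], indices[n_test:]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct = sum(
            knn_predict(train_X, train_y, X[i], k, dist) == y[i] for i in test_idx
        )
        accuracies.append(correct / n_test)
    return {
        "mean": statistics.mean(accuracies),
        "std": statistics.pstdev(accuracies),
        "spread": max(accuracies) - min(accuracies),
    }


def select_setting(X, y, ks, dists):
    """Assumed rule: highest mean accuracy, ties broken by smallest std."""
    results = {}
    for k, (name, dist) in product(ks, dists.items()):
        results[(k, name)] = evaluate_setting(X, y, k, dist)
    best = max(results, key=lambda s: (results[s]["mean"], -results[s]["std"]))
    return best, results


def make_toy_data(n=60, seed=1):
    # Synthetic stand-in data (an assumption); not the Wisconsin prognosis data.
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        center = 0.0 if label == 0 else 2.0
        X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    best, results = select_setting(
        X, y, ks=[1, 3, 5, 7], dists={"euclidean": euclidean, "manhattan": manhattan}
    )
    print("selected setting:", best, results[best])
```

Reporting the standard deviation and the max-min spread alongside the mean makes it possible to prefer a setting that is stable across splits over one that is only occasionally accurate.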
