Breast Cancer Risk Prediction using XGBoost and Random Forest Algorithm

Breast cancer is one of the most common and serious causes of death among women globally. It is a disease in which cells in the breast grow out of control. Its risk factors include a family history of cancer, physical inactivity, psychological stress, and an increase in breast size. In this paper, a breast cancer dataset was analyzed to predict breast cancer using two popular ensemble machine learning algorithms: Random Forest and Extreme Gradient Boosting (XGBoost). A total of 275 instances with 12 features were used for the analysis. Random Forest obtained an accuracy of 74.73% and XGBoost obtained 73.63%.
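The abstract reports accuracy only as percentages and does not state how the 275 instances were split for evaluation. Since accuracy is the number of correct predictions divided by the number of evaluated instances, the two figures can be checked for consistency with a whole-number count. The following is a minimal sketch of that arithmetic, not the authors' evaluation procedure; the function name `consistent_test_sizes` and the assumption that both models were scored on the same held-out set are illustrative.

```python
def consistent_test_sizes(reported_pcts, max_n):
    """Return evaluation-set sizes n for which every reported accuracy
    equals round(100 * k / n, 2) for some whole number k of correct
    predictions. Assumes all models were scored on the same n instances."""
    sizes = []
    for n in range(1, max_n + 1):
        if all(
            any(round(100 * k / n, 2) == pct for k in range(n + 1))
            for pct in reported_pcts
        ):
            sizes.append(n)
    return sizes


# Accuracies reported in the abstract; 275 is the total dataset size.
sizes = consistent_test_sizes([74.73, 73.63], max_n=275)
print(sizes)
```

Under these assumptions the smallest consistent size is 91 instances (68 and 67 correct predictions respectively), which is roughly one third of the dataset. This is only a consistency check on the reported numbers; the actual split is not given in the abstract.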
