Breast Cancer Risk Prediction Using Different Clustering Techniques

Breast cancer is one of the most common diseases among women and has a high death rate. It is a non-communicable disease that affects women all over the world. With early diagnosis, the survival rate can rise from 56% to over 86%. In this analysis, several unsupervised learning techniques were used together with kernel variants of Principal Component Analysis (PCA). K-Means and Hierarchical Clustering with three linkages (Ward, complete, and average) were applied, and the highest accuracy, 70.91%, was obtained from Hierarchical Clustering with average linkage. K-Means achieved better Recall and F1-score than Hierarchical Clustering with Ward and complete linkage. Average linkage showed satisfactory Specificity, Precision, Recall, and F1-score, with values of 60%, 70.58%, 80%, and 75% respectively.
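As a reading aid, the sketch below shows how the five reported measures follow from a binary confusion matrix. The counts used (24 true positives, 10 false positives, 6 false negatives, 15 true negatives) are an assumption: they were chosen only because they reproduce the average-linkage percentages quoted above, and they are not taken from the paper. The function name `binary_metrics` is likewise illustrative.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute standard binary classification metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp)   # share of predicted positives that are correct
    recall = tp / (tp + fn)      # share of actual positives that are found
    return {
        "accuracy": (tp + tn) / total,
        "specificity": tn / (tn + fp),  # share of actual negatives that are found
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


# Hypothetical counts (not from the paper), picked to match the reported
# average-linkage percentages.
metrics = binary_metrics(tp=24, fp=10, fn=6, tn=15)

for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

With these assumed counts, precision is 24/34, which rounds to 70.59% and truncates to the 70.58% quoted in the abstract. For a clustering method, cluster labels would first have to be mapped to the two classes before such a matrix can be built; the abstract does not describe how that mapping was done.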
