Comparative Analysis of Data Mining Algorithms for Cancer Gene Expression Data

Cancer remains among the most challenging diseases to diagnose, and experts still struggle to detect it at an early stage. Gene selection is important for identifying the factors that cause cancer. The two deadliest cancers, namely colorectal cancer and breast cancer, are found in males and females, respectively. This study aims to predict cancer at an early stage with the help of cancer bioinformatics. Given the complexity of the metabolism, signaling, and interactions involved in the disease, cancer bioinformatics is one strategy for applying bioinformatics technologies such as data mining to cancer detection. The goal of the proposed study is to compare support vector machine, random forest, decision tree, artificial neural network, and logistic regression classifiers for prediction on malignant cancer gene expression data. WEKA is used to run the algorithms on the data. The findings show that computational data mining techniques could be used to detect cancer recurrence in patients. Finally, the strategies that yielded the best results are identified.
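The kind of comparison described above can be sketched in a few lines. The study itself uses WEKA; the sketch below is only an analogue in Python with scikit-learn, run on a synthetic matrix that stands in for gene expression data. The function name `compare_classifiers`, the dataset dimensions, the fold count, and all hyperparameters are assumptions made for illustration and do not come from the study.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def compare_classifiers(X, y, n_splits=5, seed=0):
    """Return {model name: mean cross-validated accuracy} for five model families.

    Hypothetical helper: hyperparameters are illustrative defaults, not the
    settings used in the study.
    """
    models = {
        # Scale-sensitive models are wrapped in a pipeline with standardization.
        "support vector machine": make_pipeline(
            StandardScaler(), SVC(kernel="linear")
        ),
        "random forest": RandomForestClassifier(
            n_estimators=100, random_state=seed
        ),
        "decision tree": DecisionTreeClassifier(random_state=seed),
        "artificial neural network": make_pipeline(
            StandardScaler(),
            MLPClassifier(
                hidden_layer_sizes=(16,), max_iter=1000, random_state=seed
            ),
        ),
        "logistic regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
    }
    # Stratified folds keep the class balance similar in every split.
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    return {
        name: float(cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean())
        for name, model in models.items()
    }


# Synthetic stand-in for an expression matrix (rows: samples, columns: genes).
X, y = make_classification(
    n_samples=120, n_features=200, n_informative=10, random_state=0
)
scores = compare_classifiers(X, y)

# Rank the models from highest to lowest mean accuracy.
for name, accuracy in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {accuracy:.3f}")
```

Every model is scored on the same stratified folds, so the accuracies are directly comparable. On real expression data, a gene selection step would normally be fitted inside each fold to avoid leaking test information into training.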
