A novel machine learning approach for early detection of hepatocellular carcinoma patients

Abstract Liver cancer is a common type of cancer worldwide. Hepatocellular carcinoma (HCC) is the most common primary malignancy of the liver. It has a high impact on patients' lives, and detecting it early can reduce the number of annual deaths. This study proposes a new machine learning approach to detect HCC using data from 165 patients. Ten well-known machine learning algorithms are employed. In the preprocessing step, the features are normalized. A genetic algorithm coupled with stratified 5-fold cross-validation is applied twice, first for parameter optimization and then for feature selection. In this work, a support vector machine (SVM) (type C-SVC) with the new 2-level genetic optimizer (genetic training) and feature selection yielded the highest accuracy and F1-score, 0.8849 and 0.8762 respectively. The proposed model can be evaluated further on larger databases and may aid clinicians.
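The feature-selection level of such a pipeline can be sketched as follows. This is a minimal illustration and not the authors' implementation: it uses only the Python standard library, a nearest-centroid classifier stands in for the C-SVC, the data are synthetic, and every function name and parameter value (`ga_select`, `pop_size`, `p_mut`, and so on) is an assumption made for the example. The fitness of each candidate feature mask is its stratified 5-fold cross-validated accuracy.

```python
import random

def min_max_normalize(X):
    """Scale each feature column to [0, 1] (whole-dataset scaling, for brevity)."""
    cols = list(zip(*X))
    lo, hi = [min(c) for c in cols], [max(c) for c in cols]
    return [[(v - l) / (h - l) if h > l else 0.0
             for v, l, h in zip(row, lo, hi)] for row in X]

def stratified_folds(y, k, rng):
    """Split indices into k folds with near-equal class proportions."""
    folds = [[] for _ in range(k)]
    for label in sorted(set(y)):
        idx = [i for i, t in enumerate(y) if t == label]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds

def fit_centroids(X, y):
    """Stand-in classifier (assumption): one mean vector per class."""
    cents = {}
    for label in set(y):
        rows = [r for r, t in zip(X, y) if t == label]
        cents[label] = [sum(c) / len(rows) for c in zip(*rows)]
    return cents

def predict(cents, x, mask):
    """Assign x to the nearest centroid, using only the selected features."""
    def dist(c):
        return sum((a - b) ** 2 for a, b, keep in zip(x, c, mask) if keep)
    return min(sorted(cents), key=lambda lab: dist(cents[lab]))

def cv_accuracy(X, y, mask, folds):
    """Fitness: cross-validated accuracy of the classifier on masked features."""
    if not any(mask):
        return 0.0
    correct = 0
    for fold in folds:
        held = set(fold)
        train = [i for i in range(len(y)) if i not in held]
        cents = fit_centroids([X[i] for i in train], [y[i] for i in train])
        correct += sum(predict(cents, X[i], mask) == y[i] for i in fold)
    return correct / len(y)

def ga_select(X, y, folds, rng, pop_size=12, generations=15, p_mut=0.1):
    """Genetic search over boolean feature masks with elitist truncation selection."""
    n = len(X[0])
    fitness = lambda m: cv_accuracy(X, y, m, folds)
    pop = [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size)]
    pop[0] = [True] * n  # seed with the all-features mask as a baseline
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)                # one-point crossover
            child = a[:cut] + b[cut:]
            child = [(not g) if rng.random() < p_mut else g for g in child]
            children.append(child)                   # bit-flip mutation applied
        pop = parents + children
    best = max(pop, key=fitness)
    return best, fitness(best)

# Synthetic demo data: 60 samples, 6 features, only the first two informative.
rng = random.Random(0)
y = [i % 2 for i in range(60)]
X = [[rng.gauss(2 * t, 1), rng.gauss(-2 * t, 1)] + [rng.gauss(0, 1) for _ in range(4)]
     for t in y]
X = min_max_normalize(X)
folds = stratified_folds(y, 5, rng)
best_mask, best_score = ga_select(X, y, folds, rng)
print("selected features:", [j for j, keep in enumerate(best_mask) if keep])
print("cross-validated accuracy:", round(best_score, 4))
```

The parameter-optimization level described in the abstract could reuse the same loop with hyperparameter genes (for an SVM, for example, the penalty and kernel parameters) in place of the boolean mask. Scaling inside each training fold rather than over the whole dataset would avoid leaking information from held-out samples.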
