Using ensemble classification methods in lung cancer disease

This paper presents an overview of the use of ensemble classification methods in lung cancer. The analysis covers seven aspects: publication trends, channels, and venues; medical tasks tackled; ensemble types proposed; single techniques used to construct the ensembles; rules used to combine the outputs of the ensemble members; datasets used to build and evaluate the ensembles; and tools used. The application of ensemble methods to lung cancer began in 2003. Diagnosis was the task most frequently tackled by researchers. Homogeneous ensembles were the most frequent type in the literature, and decision trees were the technique most often adopted for constructing ensembles. Several lung cancer datasets were used to build and assess the ensembles, and the most used tool was Weka. Recommendations for future research are to tackle the medical tasks not yet investigated with ensemble methods, investigate other classification methods, propose other heterogeneous ensembles, and use other combination rules.
