Stacking Ensemble Technique for Classifying Breast Cancer

Objectives: Breast cancer is the second most common cancer among Korean women. Because breast cancer is strongly associated with negative emotional and physical changes, early detection and treatment are very important. To provide a supporting tool for classifying breast cancer, we sought to identify the best meta-learner model in a stacking ensemble when the same set of machine learning models is used for both the base learners and the meta-learner.

Methods: We used four machine learning models in a stacking ensemble: the gradient boosted model (GBM), distributed random forest, generalized linear model (GLM), and deep neural network. Together these models formed the base learner, and each of them was then used in turn as the meta-learner. We compared the performance of the resulting ensembles to determine the best meta-learner model.

Results: Using the GBM as the meta-learner led to higher accuracy on the breast cancer data than using any other model, and using the GLM as the meta-learner led to low root-mean-squared error for both sets of breast cancer data.

Conclusions: We compared the performance of every meta-learner model in a stacking ensemble as a supporting tool for classifying breast cancer. The study showed that using specific models as the meta-learner resulted in better performance than single classifiers, and that GBM and GLM are appropriate meta-learners for classifying breast cancer data.
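The comparison described in the abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it assumes scikit-learn, uses `GradientBoostingClassifier`, `RandomForestClassifier`, `LogisticRegression`, and `MLPClassifier` as stand-ins for the gradient boosted model, distributed random forest, generalized linear model, and deep neural network, and runs on synthetic data from `make_classification` rather than the breast cancer data sets. The function names `make_models` and `compare_meta_learners`, the hyperparameters, and the 70/30 split are all hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.neural_network import MLPClassifier


def make_models():
    """Return fresh, unfitted stand-ins for the four model families (assumed)."""
    return {
        "GBM": GradientBoostingClassifier(n_estimators=50, random_state=0),
        "DRF": RandomForestClassifier(n_estimators=50, random_state=0),
        "GLM": LogisticRegression(max_iter=1000),
        "DNN": MLPClassifier(hidden_layer_sizes=(16, 16), max_iter=1000,
                             random_state=0),
    }


def compare_meta_learners(X, y, seed=0):
    """Stack four base learners, then try each model family as meta-learner."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed)
    base = make_models()

    # Level-one training features: out-of-fold class-1 probabilities, so the
    # meta-learner never sees a prediction made on a row the base model
    # was trained on.
    Z_tr = np.column_stack([
        cross_val_predict(m, X_tr, y_tr, cv=5, method="predict_proba")[:, 1]
        for m in base.values()
    ])
    # Level-one test features: base learners refit on the full training split.
    Z_te = np.column_stack([
        m.fit(X_tr, y_tr).predict_proba(X_te)[:, 1] for m in base.values()
    ])

    results = {}
    for name, meta in make_models().items():
        meta.fit(Z_tr, y_tr)
        p = meta.predict_proba(Z_te)[:, 1]
        accuracy = float(np.mean((p >= 0.5) == y_te))
        rmse = float(np.sqrt(np.mean((p - y_te) ** 2)))
        results[name] = (accuracy, rmse)
    return results


# Synthetic binary classification data, used only as a placeholder.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
results = compare_meta_learners(X, y)
for name, (accuracy, rmse) in results.items():
    print(f"meta-learner={name}: accuracy={accuracy:.3f}, rmse={rmse:.3f}")
```

Each meta-learner is trained on the same level-one features, so differences in accuracy and root-mean-squared error reflect the choice of meta-learner alone. Numbers obtained on synthetic data say nothing about the rankings reported in the abstract.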
