Automatic diagnosis of breast cancer is a challenge whose solution promises more accessible healthcare. In this paper, we describe a pipeline for predicting slide-level cancer metastasis with machine learning techniques. First, a whole slide image is split into smaller patches, each of which is classified as cancerous or not by a model based on DenseNet, a deep neural network with established performance. Next, the patch-level results are aggregated into a confidence map, which is then passed through DBSCAN, a clustering algorithm, to reveal morphological features of cancerous regions. Finally, the minimal number of slides with the highest representative power is selected through independent repetitions of train-validation cycles with XGBoost. The resulting slide-level predictions from the XGBoost ensemble determine the pN stages of individual patients.
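The first two steps of the pipeline (patching and aggregation into a confidence map) can be illustrated with a minimal sketch. It is not the implementation used in this work: the helper names `split_into_patches`, `confidence_map`, and `mean_intensity_classifier` are hypothetical, the image is a toy grayscale grid rather than a whole slide image, and the classifier is a placeholder standing in for the DenseNet-based model.

```python
from typing import Callable, List, Sequence

Patch = List[List[float]]  # toy stand-in for an image patch (grayscale values)


def split_into_patches(image: Sequence[Sequence[float]], patch_size: int) -> List[List[Patch]]:
    """Tile a 2D image into non-overlapping patch_size x patch_size patches.

    Trailing rows and columns that do not fill a whole patch are dropped.
    """
    n_rows = len(image) // patch_size
    n_cols = len(image[0]) // patch_size
    grid: List[List[Patch]] = []
    for r in range(n_rows):
        row: List[Patch] = []
        for c in range(n_cols):
            patch = [
                list(image[r * patch_size + i][c * patch_size:(c + 1) * patch_size])
                for i in range(patch_size)
            ]
            row.append(patch)
        grid.append(row)
    return grid


def confidence_map(grid: List[List[Patch]], classify: Callable[[Patch], float]) -> List[List[float]]:
    """Apply a patch-level classifier to every patch, keeping the grid layout."""
    return [[classify(patch) for patch in row] for row in grid]


def mean_intensity_classifier(patch: Patch) -> float:
    """Placeholder for the patch classifier (assumption): mean intensity as a score."""
    values = [v for row in patch for v in row]
    return sum(values) / len(values)


# Toy 4x4 "slide": only the top-left 2x2 region is "bright".
demo_image = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
demo_map = confidence_map(split_into_patches(demo_image, 2), mean_intensity_classifier)
```

In a full pipeline, a map like `demo_map` would be the input to the clustering step, with each cell holding the model's tumor probability for the corresponding patch.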
[1] Tianqi Chen, et al. XGBoost: A Scalable Tree Boosting System. KDD, 2016.
[2] Hans-Peter Kriegel, et al. A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. KDD, 1996.
[3] Geoffrey E. Hinton, et al. Dynamic Routing Between Capsules. NIPS, 2017.
[4] Dayong Wang, et al. Deep Learning for Identifying Metastatic Breast Cancer. ArXiv, 2016.
[5] Kilian Q. Weinberger, et al. Densely Connected Convolutional Networks. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.