Data Augmentation of Minority Class with Transfer Learning for Classification of Imbalanced Breast Cancer Dataset Using Inception-V3

This paper presents deep learning experiments that investigate the effect of augmenting only the minority class of the imbalanced breast cancer histopathology dataset (BreakHis). Two pre-trained networks, Inception-v3 and ResNet-50, are fine-tuned on the minority-augmented dataset. Both networks were pre-trained on the well-known ImageNet dataset, which comprises millions of high-resolution images belonging to multiple object categories. The pre-trained model is then adapted by transfer learning to distinguish cancerous patterns from non-cancerous conditions in a supervised manner. The experiments were carried out in two phases. Phase I investigates the effect of data augmentation applied to the minority class for the Inception-v3 and ResNet-50 pre-trained networks. Phase II builds on the Phase I results with a transfer learning approach in which features extracted from all layers of Inception-v3 are used to train SVM and weighted SVM classifiers. The experimental results show that the pre-trained Inception-v3 model with minority-class data augmentation outperforms the other network types, and that Inception-v3 with minority-class augmentation combined with transfer learning using a weighted SVM gives the highest classification accuracy.
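The two ideas in the abstract, augmenting only the minority class and weighting the SVM by class, can be illustrated with a minimal NumPy sketch. It is not the authors' pipeline: the abstract does not specify the transforms or the class weights, so flips and 90-degree rotations, the inverse-frequency weighting formula, and the function names `augment_minority` and `balanced_class_weights` are all assumptions made for illustration. The sketch also assumes a binary problem and images already resized to square patches (for example 299x299, the usual Inception-v3 input size), so that rotations preserve the array shape.

```python
import numpy as np


def augment_minority(images, labels, minority_label, seed=0):
    """Add flipped/rotated copies of minority images until both classes match in size.

    Assumes images has shape (N, H, W, C) with H == W and a binary label vector.
    The transform set is an illustrative choice, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    images = np.asarray(images)
    labels = np.asarray(labels)

    minority_idx = np.flatnonzero(labels == minority_label)
    n_majority = int(np.sum(labels != minority_label))
    n_needed = max(n_majority - minority_idx.size, 0)
    if n_needed == 0:
        return images, labels

    # Label-preserving geometric transforms applied to a single (H, W, C) image.
    transforms = [
        np.fliplr,
        np.flipud,
        lambda img: np.rot90(img, 1),
        lambda img: np.rot90(img, 2),
        lambda img: np.rot90(img, 3),
    ]

    new_images = []
    for _ in range(n_needed):
        source = images[rng.choice(minority_idx)]
        transform = transforms[rng.integers(len(transforms))]
        new_images.append(transform(source))

    aug_images = np.concatenate([images, np.stack(new_images)])
    aug_labels = np.concatenate(
        [labels, np.full(n_needed, minority_label, dtype=labels.dtype)]
    )
    return aug_images, aug_labels


def balanced_class_weights(labels):
    """Inverse-frequency class weights: n_samples / (n_classes * class_count).

    This is one common heuristic for a weighted SVM; the paper's actual
    weights are not given in the abstract.
    """
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    return {int(c): labels.size / (classes.size * n) for c, n in zip(classes, counts)}


if __name__ == "__main__":
    # Toy data standing in for histopathology patches: 6 majority, 2 minority.
    demo_rng = np.random.default_rng(1)
    x = demo_rng.integers(0, 256, size=(8, 4, 4, 3), dtype=np.uint8)
    y = np.array([0] * 6 + [1] * 2)

    x_aug, y_aug = augment_minority(x, y, minority_label=1)
    print("class counts after augmentation:", np.bincount(y_aug))
    print("class weights before augmentation:", balanced_class_weights(y))
```

In a setup like the one described, the augmented arrays would feed the fine-tuning step, while a weight dictionary of this kind could be passed to an SVM implementation that accepts per-class weights. Whether weights are computed before or after augmentation is a design choice the abstract leaves open.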
