How much data is needed to train a medical image deep learning system to achieve necessary high accuracy

Convolutional neural networks (CNNs) have produced impressive results in natural image classification. Because medical images have properties that make them well suited to deep learning, applying such systems to medical image classification holds much promise. However, the usefulness and potential impact of such a system can be negated entirely if it does not reach a target accuracy. In this paper, we present a study on determining the optimum size of the training data set necessary to achieve high classification accuracy with low variance in medical image classification systems. A CNN was applied to classify axial computed tomography (CT) images into six anatomical classes. We trained the CNN using six different training data set sizes (5, 10, 20, 50, 100, and 200) and then tested the resulting system on a total of 6000 CT images. All images were acquired from the Massachusetts General Hospital (MGH) Picture Archiving and Communication System (PACS). Using these data, we employ the learning curve approach to predict classification accuracy at a given training sample size. We present a general methodology for determining the training data set size necessary to achieve a given target classification accuracy, and this methodology can be easily applied to other problems within such systems.
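
The learning curve approach can be illustrated with a minimal sketch. It assumes an inverse power law form, acc(n) = a - b * n^(-c), which is a common choice for learning curves but is not stated in this abstract. The function names are hypothetical, and the accuracy values are synthetic (generated from made-up parameters), not results from the study. Only the six training set sizes come from the text.

```python
import math


def fit_learning_curve(sizes, accuracies, c_grid=None):
    """Fit acc(n) = a - b * n**(-c) (assumed model, not taken from the paper).

    For a fixed exponent c the model is linear in z = n**(-c), so a and b
    follow from ordinary least squares; c itself is chosen by a grid scan.
    """
    if c_grid is None:
        c_grid = [k / 100 for k in range(5, 301)]  # c from 0.05 to 3.00
    m = len(sizes)
    y_mean = sum(accuracies) / m
    best = None
    for c in c_grid:
        z = [n ** (-c) for n in sizes]
        z_mean = sum(z) / m
        szz = sum((zi - z_mean) ** 2 for zi in z)
        szy = sum((zi - z_mean) * (yi - y_mean) for zi, yi in zip(z, accuracies))
        slope = szy / szz            # slope of acc against z, equal to -b
        a = y_mean - slope * z_mean  # intercept: the fitted accuracy plateau
        b = -slope
        sse = sum((a - b * zi - yi) ** 2 for zi, yi in zip(z, accuracies))
        if best is None or sse < best[0]:
            best = (sse, a, b, c)
    _, a, b, c = best
    return a, b, c


def predict_accuracy(n, a, b, c):
    """Accuracy predicted by the fitted curve at training set size n."""
    return a - b * n ** (-c)


def required_size(target, a, b, c):
    """Smallest n whose predicted accuracy reaches target, or None if the
    target is at or above the fitted plateau (or the curve does not rise)."""
    if target >= a or b <= 0:
        return None
    return math.ceil((b / (a - target)) ** (1.0 / c))


# Training set sizes from the abstract; accuracies are SYNTHETIC placeholders
# generated from made-up parameters, not measurements from the study.
sizes = [5, 10, 20, 50, 100, 200]
accuracies = [0.98 - 0.9 * n ** (-0.5) for n in sizes]

a, b, c = fit_learning_curve(sizes, accuracies)
n_needed = required_size(0.90, a, b, c)
print(f"plateau={a:.3f}, decay exponent={c:.2f}, size for 90% accuracy={n_needed}")
```

With real measurements, each size would typically have several accuracy values from repeated training runs, and the fit could be weighted by their variance. The grid scan is used here only to keep the sketch dependency-free; a nonlinear least-squares solver would be the more usual choice.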
