Association Between Surgical Skin Markings in Dermoscopic Images and Diagnostic Performance of a Deep Learning Convolutional Neural Network for Melanoma Recognition.

Importance: Deep learning convolutional neural networks (CNNs) have shown a performance at the level of dermatologists in the diagnosis of melanoma. Accordingly, further exploring the potential limitations of CNN technology before broadly applying it is of special interest.

Objective: To investigate the association between gentian violet surgical skin markings in dermoscopic images and the diagnostic performance of a CNN approved for use as a medical device in the European market.

Design and Setting: A cross-sectional analysis was conducted from August 1, 2018, to November 30, 2018, using a CNN architecture trained with more than 120 000 dermoscopic images of skin neoplasms and corresponding diagnoses. The association of gentian violet skin markings in dermoscopic images with the performance of the CNN was investigated in 3 image sets of 130 melanocytic lesions each (107 benign nevi, 23 melanomas).

Exposures: The same lesions were sequentially imaged with and without the application of a gentian violet surgical skin marker and then evaluated by the CNN for their probability of being a melanoma. In addition, the markings were removed by manually cropping the dermoscopic images to focus on the melanocytic lesion.

Main Outcomes and Measures: Sensitivity, specificity, and area under the curve (AUC) of the receiver operating characteristic (ROC) curve for the CNN's diagnostic classification in unmarked, marked, and cropped images.

Results: In all, 130 melanocytic lesions (107 benign nevi and 23 melanomas) were imaged. In unmarked lesions, the CNN achieved a sensitivity of 95.7% (95% CI, 79%-99.2%) and a specificity of 84.1% (95% CI, 76.0%-89.8%). The ROC AUC was 0.969. In marked lesions, an increase in melanoma probability scores was observed that resulted in a sensitivity of 100% (95% CI, 85.7%-100%) and a significantly reduced specificity of 45.8% (95% CI, 36.7%-55.2%, P < .001). The ROC AUC was 0.922. Cropping images led to the highest sensitivity of 100% (95% CI, 85.7%-100%), specificity of 97.2% (95% CI, 92.1%-99.0%), and ROC AUC of 0.993. Heat maps created by vanilla gradient descent backpropagation indicated that the blue markings were associated with the increased false-positive rate.

Conclusions and Relevance: This study's findings suggest that skin markings significantly interfered with the CNN's correct diagnosis of nevi by increasing the melanoma probability scores and consequently the false-positive rate. A predominance of skin markings in melanoma training images may have induced the CNN's association of markings with a melanoma diagnosis. Accordingly, these findings suggest that skin markings should be avoided in dermoscopic images intended for analysis by a CNN.

Trial Registration: German Clinical Trial Register (DRKS) Identifier: DRKS00013570.
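
For readers who want to check how the reported sensitivity and specificity figures relate to the underlying lesion counts, the following is a minimal Python sketch. It rests on two assumptions that are not stated in the abstract: the confusion-matrix counts are back-calculated from the reported percentages (for example, 22 of 23 melanomas for 95.7%), and the 95% CIs are computed with the Wilson score interval, which happens to be consistent with the reported bounds for these counts. The names `wilson_interval`, `summarize`, and `IMAGE_SETS` are illustrative only.

```python
from math import sqrt

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def wilson_interval(successes, total, z=Z_95):
    """Wilson score interval for a binomial proportion (assumed CI method)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    # Clamp to [0, 1] to absorb floating-point overshoot at the boundaries.
    return max(0.0, center - half), min(1.0, center + half)


# ASSUMPTION: counts inferred from the reported percentages
# (23 melanomas and 107 benign nevi per image set); not given in the abstract.
IMAGE_SETS = {
    "unmarked": {"tp": 22, "fn": 1, "tn": 90, "fp": 17},
    "marked": {"tp": 23, "fn": 0, "tn": 49, "fp": 58},
    "cropped": {"tp": 23, "fn": 0, "tn": 104, "fp": 3},
}


def summarize(counts):
    """Return sensitivity and specificity with their Wilson 95% intervals."""
    n_melanoma = counts["tp"] + counts["fn"]
    n_nevi = counts["tn"] + counts["fp"]
    return {
        "sensitivity": counts["tp"] / n_melanoma,
        "sensitivity_ci": wilson_interval(counts["tp"], n_melanoma),
        "specificity": counts["tn"] / n_nevi,
        "specificity_ci": wilson_interval(counts["tn"], n_nevi),
    }


for name, counts in IMAGE_SETS.items():
    s = summarize(counts)
    sens_lo, sens_hi = s["sensitivity_ci"]
    spec_lo, spec_hi = s["specificity_ci"]
    print(
        f"{name}: sensitivity {s['sensitivity']:.1%} "
        f"({sens_lo:.1%} to {sens_hi:.1%}), "
        f"specificity {s['specificity']:.1%} "
        f"({spec_lo:.1%} to {spec_hi:.1%})"
    )
```

Under these assumed counts, the drop in specificity for marked images corresponds to 58 of 107 nevi being classified as melanoma, compared with 17 of 107 for unmarked images and 3 of 107 after cropping.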
