Conditional dependence tests reveal the usage of ABCD rule features and bias variables in automatic skin lesion classification

Skin cancer is the most common form of cancer, and melanoma is the leading cause of skin cancer-related deaths. Early detection of melanoma is crucial for improving the chances of survival, and automated systems for classifying skin lesions can assist with the initial analysis. However, if we expect people to entrust their well-being to an automatic classification algorithm, we must ensure that the algorithm makes medically sound decisions. We investigate this by testing whether two state-of-the-art models use the features defined in the dermoscopic ABCD rule or whether they rely on biases. We use a method that frames supervised learning as a structural causal model, which reduces the question of whether a feature is used to a conditional dependence test. We show that this conditional dependence method yields meaningful results on data from the ISIC archive. Furthermore, we find that the selected models incorporate asymmetry, border, and dermoscopic structures in their decisions, but not color. Finally, we show that the same classifiers also use bias features such as the patient’s age, skin color, or the existence of colorful patches.
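To make the idea of "feature usage as a conditional dependence test" concrete, the following is a minimal sketch on synthetic data. It assumes the question is whether a feature and the classifier's prediction remain dependent once the ground-truth label is accounted for. It swaps in a simple linear partial-correlation permutation test, which is not necessarily the test used in this work. All names (`residualize`, `partial_corr_test`, `used`, `unused`) and all data are hypothetical.

```python
import numpy as np


def residualize(v, z):
    """Remove the part of v that is linearly explained by z (plus an intercept)."""
    design = np.column_stack([np.ones(len(z)), z])
    coef, *_ = np.linalg.lstsq(design, v, rcond=None)
    return v - design @ coef


def partial_corr_test(feature, prediction, label, n_perm=500, seed=0):
    """Permutation test for linear dependence of feature and prediction given label.

    Assumption: dependence is approximated by the correlation between the
    residuals left after regressing each variable on the label.
    """
    rng = np.random.default_rng(seed)
    res_feature = residualize(feature, label)
    res_prediction = residualize(prediction, label)
    observed = abs(np.corrcoef(res_feature, res_prediction)[0, 1])
    # Null distribution: shuffle one set of residuals to break any dependence.
    null = np.empty(n_perm)
    for i in range(n_perm):
        shuffled = rng.permutation(res_feature)
        null[i] = abs(np.corrcoef(shuffled, res_prediction)[0, 1])
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, p_value


# Synthetic stand-ins: both features correlate with the label,
# but only `used` actually enters the (simulated) classifier output.
rng = np.random.default_rng(42)
n = 2000
label = rng.integers(0, 2, size=n).astype(float)
used = label + rng.normal(size=n)
unused = label + rng.normal(size=n)
prediction = 0.5 * label + 0.8 * used + 0.3 * rng.normal(size=n)

r_used, p_used = partial_corr_test(used, prediction, label)
r_unused, p_unused = partial_corr_test(unused, prediction, label)
print(f"used feature:   |r| = {r_used:.3f}, p = {p_used:.4f}")
print(f"unused feature: |r| = {r_unused:.3f}, p = {p_unused:.4f}")
```

In this toy setting, a small p-value for `used` suggests that the simulated classifier relies on that feature beyond what the label explains. The `unused` feature correlates with the label but should show only a weak residual correlation with the prediction. A linear test like this one would miss nonlinear dependence, so a kernel-based or other nonparametric test may be preferable in practice.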
