A Comparative Study for Classification of Skin Cancer

Skin cancer is one of the most common types of cancer worldwide. It is easily treatable when detected at an early stage. Melanoma is the most dangerous form of skin cancer, and its early detection is important for reducing skin cancer mortality. Recently, machine learning has become an efficient method for classifying skin lesions as melanoma or benign. The main features for this task include color, texture, and shape, and a comparative study of these features is useful for future research on skin cancer classification. Motivated by this, our study compares the classification results of 6 classifiers in combination with 7 feature extraction methods and 4 data preprocessing steps on the two largest skin cancer datasets. Our findings reveal that a system consisting of Linear Normalization of the input image as the data preprocessing step, HSV as the feature extraction method, and Balanced Random Forest as the classifier yields the best prediction results on the HAM10000 dataset, with 81.46% AUC, 74.75% accuracy, 90.09% sensitivity, and 72.84% specificity.
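
To make the best-performing configuration more concrete, the following is a minimal sketch of its preprocessing and feature extraction stages, plus the threshold-based metrics quoted above. It is not the implementation used in the study. It assumes that "Linear Normalization" means a per-channel min-max stretch and that the HSV feature is a concatenated per-channel histogram. The function names, the bin count, and the flat list-of-pixels image representation are all hypothetical. The classifier itself is omitted; one option may be `BalancedRandomForestClassifier` from the imbalanced-learn package, though the abstract does not say which implementation was used.

```python
import colorsys


def linear_normalize(pixels):
    """Min-max stretch each RGB channel to [0, 1].

    Assumption: this is one plausible reading of "Linear Normalization".
    `pixels` is a non-empty list of (r, g, b) tuples on any numeric scale.
    """
    channels = list(zip(*pixels))
    lows = [min(c) for c in channels]
    highs = [max(c) for c in channels]
    return [
        tuple(
            (v - lo) / (hi - lo) if hi > lo else 0.0  # flat channel maps to 0
            for v, lo, hi in zip(px, lows, highs)
        )
        for px in pixels
    ]


def hsv_histogram(pixels, bins=4):
    """Concatenate normalized H, S, and V histograms (hypothetical feature).

    `pixels` must hold RGB values already scaled to [0, 1].
    """
    if not pixels:
        raise ValueError("pixels must not be empty")
    hist = [0.0] * (3 * bins)
    for r, g, b in pixels:
        for channel, value in enumerate(colorsys.rgb_to_hsv(r, g, b)):
            index = min(int(value * bins), bins - 1)  # clamp value == 1.0
            hist[channel * bins + index] += 1.0
    return [count / len(pixels) for count in hist]


def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity with melanoma encoded as 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # share of melanomas detected
        "specificity": tn / (tn + fp),  # share of benign lesions cleared
    }
```

In this sketch, a feature vector would be obtained with `hsv_histogram(linear_normalize(pixels))` and passed to the classifier. AUC is not computed here because it requires predicted scores rather than hard labels.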
