The semantic segmentation approach for normal and pathologic tympanic membrane using deep learning

Objectives The purpose of this study was to create a deep learning model for the detection and segmentation of major structures of the tympanic membrane. Methods Total 920 tympanic endoscopic images had been stored were obtained, retrospectively. We constructed a detection and segmentation model using Mask R-CNN with ResNet-50 backbone targeting three clinically meaningful structures: (1) tympanic membrane (TM); (2) malleus with side of tympanic membrane; and (3) suspected perforation area. The images were randomly divided into three sets – taining set, validation set, and test set – at a ratio of 0.6:0.2:0.2, resulting in 548, 187, and 185 images, respectively. After assignment, 548 tympanic membrane images were augmented 50 times each, reaching 27,400 images. Results At the most optimized point of the model, it achieved a mean average precision of 92.9% on test set. When an intersection over Union (IoU) score of greater than 0.5 was used as the reference point, the tympanic membrane was 100% detectable, the accuracy of side of the tympanic membrane based on the malleus segmentation was 88.6% and detection accuracy of suspicious perforation was 91.4%. Conclusions Anatomical segmentation may allow the inclusion of an explanation provided by deep learning as part of the results. This method is applicable not only to tympanic endoscope, but also to sinus endoscope, laryngoscope, and stroboscope. Finally, it will be the starting point for the development of automated medical records descriptor of endoscope images.

[1]  Hermanus C. Myburgh,et al.  Otitis Media Diagnosis for Developing Countries Using Tympanic Membrane Image-Analysis , 2016, EBioMedicine.

[2]  Jian Sun,et al.  Instance-Aware Semantic Segmentation via Multi-task Network Cascades , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[3]  Muhammad Shazam Kasher Otitis Media Analysis - An Automated Feature Extraction and Image Classification System , 2018 .

[4]  M. Pichichero,et al.  Comparison of performance by otolaryngologists, pediatricians, and general practioners on an otoendoscopic diagnostic video examination. , 2005, International journal of pediatric otorhinolaryngology.

[5]  Jelena Kovacevic,et al.  Automated Diagnosis of Otitis Media: Vocabulary and Grammar , 2013, Int. J. Biomed. Imaging.

[6]  Bram van Ginneken,et al.  A survey on deep learning in medical image analysis , 2017, Medical Image Anal..

[7]  Geraint Rees,et al.  Clinically applicable deep learning for diagnosis and referral in retinal disease , 2018, Nature Medicine.

[8]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  D. Shen,et al.  Computer-Aided Diagnosis with Deep Learning Architecture: Applications to Breast Lesions in US Images and Pulmonary Nodules in CT Scans , 2016, Scientific Reports.

[10]  Metin Nafi Gürcan,et al.  Detection of eardrum abnormalities using ensemble deep learning approaches , 2018, Medical Imaging.

[11]  Pietro Perona,et al.  Microsoft COCO: Common Objects in Context , 2014, ECCV.