Diagnosis with Confidence: Deep Learning for Reliable Classification of Squamous Lesions of the Upper Aerodigestive Tract

Background: Diagnosis of head and neck (HN) squamous dysplasias and carcinomas is critical for patient care, cure, and follow-up. It can be challenging, especially when grading intraepithelial lesions. Despite the recent simplification in the latest WHO grading system, inter- and intra-observer variability remains substantial, particularly among non-specialized pathologists, highlighting the need for new tools to support pathologists.

Methods: In this study, we investigated the potential of deep learning to assist the pathologist with automatic and reliable classification of HN lesions following the 2022 WHO classification system. We created, for the first time, a large-scale database of histological samples (>2000 slides) intended for developing an automatic diagnostic tool. We developed and trained a weakly supervised model that performs classification from whole slide images (WSI). We evaluated our model on both internal and external test sets, and we defined and validated a new confidence score for assessing predictions, which can be used to identify difficult cases.

Results: Our model demonstrated high classification accuracy across all lesion types on both internal and external test sets (average AUC: 0.878 (95% CI: [0.834-0.918]) and 0.886 (95% CI: [0.813-0.947]), respectively), and the confidence score allowed for accurate differentiation between reliable and uncertain predictions.

Conclusions: Our results demonstrate that the model, combined with confidence measurements, can help in the difficult task of classifying head and neck squamous lesions by limiting variability and detecting ambiguous cases, taking us one step closer to a wider adoption of AI-based assistive tools.
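To illustrate how a confidence score can separate reliable from uncertain predictions, the following is a minimal sketch under stated assumptions. The abstract does not define the authors' score, so this example substitutes a simple stand-in: the margin between the two highest class probabilities. The function names, the three-class probability vectors, and the threshold of 0.3 are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def margin_confidence(probs: List[float]) -> float:
    """Hypothetical confidence: gap between the two largest class probabilities.

    This is an assumed stand-in, not the confidence score defined in the paper.
    """
    top, second = sorted(probs, reverse=True)[:2]
    return top - second


def triage(
    slide_probs: List[List[float]], threshold: float = 0.3
) -> List[Tuple[int, float, bool]]:
    """Return (predicted class index, confidence, is_reliable) for each slide.

    The threshold is an illustrative value, not one reported by the authors.
    """
    results = []
    for probs in slide_probs:
        conf = margin_confidence(probs)
        pred = max(range(len(probs)), key=probs.__getitem__)
        results.append((pred, conf, conf >= threshold))
    return results


# Made-up per-slide class probabilities (not data from the study).
example = [
    [0.85, 0.10, 0.05],  # clear winner: large margin
    [0.40, 0.45, 0.15],  # two classes nearly tied: small margin
]
results = triage(example)
for pred, conf, reliable in results:
    status = "reliable" if reliable else "flag for review"
    print(f"class {pred}, confidence {conf:.2f}: {status}")
```

In a setup like this, slides falling below the threshold would be routed to a pathologist for closer review, which is one way ambiguous cases could be detected in practice.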
