Collaborative Learning of Cross-channel Clinical Attention for Radiotherapy-Related Esophageal Fistula Prediction from CT

Early prognosis of radiotherapy-related esophageal fistula is important for personalized risk stratification and treatment planning in esophageal cancer (EC) patients. Effectively fusing multi-level radiographic visual descriptors under the guidance of diagnostic considerations remains challenging. We propose an end-to-end, clinical-knowledge-enhanced model for multi-level cross-channel feature extraction and aggregation. First, clinical attention is represented by contextual CT, the segmented tumor, and the anatomical surroundings, each taken from nine view planes. Then, for each view, a Cross-Channel-Atten Network is proposed that combines CNN blocks for multi-level feature extraction, a cross-channel convolution module for embedding multi-domain clinical knowledge at the same feature level, and an attention mechanism for the final adaptive fusion of multi-level cross-domain radiographic features. Experimental results and an ablation study on 558 EC patients showed that our model outperformed the compared methods, including variants with or without multi-view input, multi-domain knowledge, and multi-level attentional features. Visual analysis of the attention maps shows that the network learns to focus on the tumor and organs of interest, including the esophagus, trachea, and mediastinal connective tissues.
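The two fusion steps named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`cross_channel_mix`, `attention_fuse`), the tensor shapes, the random weights, and the use of dot-product scoring with a softmax are all assumptions made for illustration. The abstract specifies neither the layer sizes nor the exact form of the attention mechanism.

```python
import numpy as np


def cross_channel_mix(domain_maps, weights):
    """Hypothetical 1x1 cross-channel convolution.

    Mixes D same-level feature maps (one per clinical domain, e.g.
    contextual CT, tumor, surroundings) into K fused maps.
    domain_maps: (D, H, W); weights: (K, D); returns (K, H, W).
    """
    return np.tensordot(weights, domain_maps, axes=([1], [0]))


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def attention_fuse(level_features, query):
    """Hypothetical adaptive fusion of multi-level features.

    level_features: (L, F), one feature vector per network level.
    query: (F,), a scoring vector (learned in a real model; random here).
    Returns the attention-weighted sum (F,) and the weights (L,).
    """
    scores = level_features @ query   # one scalar score per level
    alpha = softmax(scores)           # weights sum to 1 across levels
    return alpha @ level_features, alpha


rng = np.random.default_rng(0)

# Assumed toy shapes: 3 domains, 8x8 feature maps, 4 fused channels.
domains = rng.normal(size=(3, 8, 8))
w = rng.normal(size=(4, 3))
mixed = cross_channel_mix(domains, w)

# Assumed toy shapes: 3 feature levels, 16-dimensional descriptors.
levels = rng.normal(size=(3, 16))
q = rng.normal(size=16)
fused, alpha = attention_fuse(levels, q)
```

In a trained model `w` and `q` would be learned parameters; the sketch only shows how domain maps at one level are mixed across channels and how level-wise descriptors are combined with weights that sum to one.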
