Pan-Cancer Integrative Histology-Genomic Analysis via Interpretable Multimodal Deep Learning

Cancer prognostication and therapeutic response prediction are driven by prognostic markers in both histopathology slides and molecular profiles. The rapidly emerging field of deep learning-based computational pathology has shown promise in developing objective prognostic models from histology whole-slide images, and several studies have focused on predicting genomic and transcriptomic alterations from histology images. However, most prognostic models are based on either histology or genomics alone and do not address how the two can be integrated to develop joint image-omic assays and prognostic models. To overcome these challenges, we present a pan-cancer integrative platform for biomarker discovery in both histology slides and molecular profile data (http://pancancer.ai). We used corresponding gigapixel whole-slide images, RNA-Seq abundance, copy number variation, and mutation data from 5,720 patients across 14 cancer types to train a weakly supervised, interpretable, multimodal deep learning algorithm that not only fuses these heterogeneous modalities for survival analysis, but also uses multimodal interpretability to discover prognostic features in these modalities that correspond to poor and favorable survival outcomes. In 5-fold cross-validation, we compared our model with unimodal deep learning models trained on histology slides or molecular profiles alone, and demonstrated improved risk stratification in 9 of 14 cancers, as well as shifts in feature importance when conditioned on more than one modality. To validate our setup as a platform for driving biomarker discovery, we analyzed the morphological and molecular features identified in high-attribution regions in every patient, and discovered statistically significant associations between high tumor-infiltrating lymphocyte (TIL) presence in H&E [...]

2021 Jan 13-14. Philadelphia (PA): AACR; Clin Cancer Res 2021;27(5_Suppl):Abstract nr PO-002.
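The comparison between the multimodal and unimodal models depends on how well each model's predicted risk scores order patients by survival. The abstract does not name the metric used; a common choice in survival analysis is Harrell's concordance index, and the plain-Python sketch below shows how such a comparison might be scored. The function name, the simplified handling of tied times, and the toy data are illustrative assumptions, not the authors' evaluation code.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C: share of comparable patient pairs ordered correctly by risk.

    times  -- observed follow-up time per patient
    events -- 1 if the event (death) was observed, 0 if censored
    risks  -- predicted risk score (higher means worse prognosis)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient a has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Simplification: skip tied times. A pair is comparable only when
        # the earlier time is an observed event rather than a censoring.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0   # higher risk for the earlier event: correct
        elif risks[a] == risks[b]:
            concordant += 0.5   # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): four patients, the third one censored.
times = [5, 10, 12, 20]
events = [1, 1, 0, 1]
risk_multimodal = [0.9, 0.7, 0.4, 0.1]
risk_unimodal = [0.9, 0.2, 0.4, 0.7]

print(concordance_index(times, events, risk_multimodal))  # 1.0
print(concordance_index(times, events, risk_unimodal))    # 0.6
```

A value of 1.0 means every comparable pair is ranked correctly and 0.5 corresponds to random ordering. In a cross-validated setting, the index would typically be computed on each held-out fold and then summarized across folds.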
