A deep learning model to predict RNA-Seq expression of tumours from whole slide images

Deep learning methods for digital pathology analysis are an effective way to address multiple clinical questions, from diagnosis to prediction of treatment outcomes. These methods have also been used to predict gene mutations from pathology images, but no comprehensive evaluation of their potential for extracting molecular features from histology slides has yet been performed. We show that HE2RNA, a model based on the integration of multiple data modes, can be trained to systematically predict RNA-Seq profiles from whole-slide images alone, without expert annotation. Through its interpretable design, HE2RNA provides virtual spatialization of gene expression, as validated by CD3 and CD20 staining on an independent dataset. The transcriptomic representation learned by HE2RNA can also be transferred to other datasets, even small ones, to improve prediction performance for specific molecular phenotypes. We illustrate the use of this approach for clinical diagnosis, such as the identification of tumours with microsatellite instability.

RNA sequencing of tumour tissue can provide important diagnostic and prognostic information, but it is costly and not routinely performed in all clinical settings. Here, the authors show that whole-slide histology images, which are part of routine care, can be used to predict RNA-sequencing data and thus reduce the need for additional analyses.
