Integrating spatial gene expression and breast tumour morphology via deep learning

Spatial transcriptomics measures RNA abundance at high spatial resolution, making it possible to systematically link the morphology of cellular neighbourhoods to spatially localized gene expression. Here, we report a deep learning algorithm that predicts local gene expression from haematoxylin-and-eosin-stained histopathology images, developed using a new dataset of 30,612 spatially resolved gene expression measurements matched to histopathology images from 23 patients with breast cancer. We identified more than 100 genes whose expression can be predicted from the histopathology images at a resolution of 100 µm, including known breast cancer biomarkers of intratumoral heterogeneity and the co-localization of tumour growth and immune activation. We also show that the algorithm generalizes well to The Cancer Genome Atlas and to other breast cancer gene expression datasets without re-training. Predicting the spatially resolved transcriptome of a tissue directly from tissue images may enable image-based screening for molecular biomarkers with spatial variation.

Deep learning can predict spatial variations in gene expression from haematoxylin-and-eosin-stained histopathology images of patients with cancer.
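To make the 100 µm prediction resolution concrete, the sketch below computes the pixel bounding box of an image patch centred on a measurement location. It is an illustration only, not the authors' pipeline: the function name `patch_box`, the default scan resolution of 0.5 µm per pixel, and the edge-shifting behaviour are all assumptions introduced here.

```python
def patch_box(center_x, center_y, patch_um=100.0, um_per_pixel=0.5, image_size=None):
    """Return (left, top, right, bottom) pixel bounds of a square patch.

    patch_um is the physical side length of the patch (100 µm in the abstract).
    um_per_pixel is the scan resolution; 0.5 is an assumed value, not taken
    from the source. image_size is an optional (width, height) in pixels.
    """
    if um_per_pixel <= 0:
        raise ValueError("um_per_pixel must be positive")

    # Convert the physical patch size to a whole number of pixels.
    side = round(patch_um / um_per_pixel)
    half = side // 2

    left = center_x - half
    top = center_y - half

    if image_size is not None:
        width, height = image_size
        if side > width or side > height:
            raise ValueError("patch is larger than the image")
        # Shift the box inward so it stays fully inside the image.
        left = min(max(left, 0), width - side)
        top = min(max(top, 0), height - side)

    return (left, top, left + side, top + side)


if __name__ == "__main__":
    # A location at pixel (1000, 800) on a slide scanned at 0.5 µm per pixel.
    print(patch_box(1000, 800))
```

Under these assumptions a 100 µm patch spans 200 pixels per side; each such patch would be the image input paired with the gene expression measured at that location.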
