Deep-learning and transfer learning identify new breast cancer survival subtypes from single-cell imaging data

Quantitative models that explicitly capture single-cell resolution cell-cell interaction features to predict patient survival at population scale are currently missing. Here, we computationally extracted hundreds of features describing single-cell-based cell-cell interactions and cellular phenotypes from a large, published cohort of cyto-images of breast cancer patients. We applied these features to a neural-network-based Cox-nnet survival model and obtained high accuracy in predicting patient survival in test data (concordance index > 0.8). We identified seven survival subtypes using the top survival features, which present distinct profiles of epithelial, immune, and fibroblast cells and their interactions. We identified atypical subpopulations of triple-negative breast cancer (TNBC) patients with moderate prognosis (marked by GATA3 over-expression) and Luminal A patients with poor prognosis (marked by KRT6 and ACTA2 over-expression and CDH1 under-expression). These atypical subpopulations were validated in the TCGA-BRCA and METABRIC datasets. This work provides important guidelines on bridging single-cell level information towards population-level survival prediction.

STATEMENT OF TRANSLATIONAL RELEVANCE

Our findings from a breast cancer population cohort demonstrate the clinical utility of single-cell level imaging mass cytometry (IMC) data as a new type of marker for predicting patient prognosis. Not only did the prognosis prediction achieve high accuracy, with a concordance index greater than 0.8, but it also enabled the discovery of seven survival subtypes that are more distinguishable than the molecular subtypes. These new subtypes present distinct profiles of epithelial, immune, and fibroblast cells and their interactions. Most importantly, using multiple large breast cancer cohorts, this study identified and validated atypical subpopulations of TNBC patients with moderate prognosis (GATA3 over-expression) and Luminal A patients with poor prognosis (KRT6 and ACTA2 over-expression and CDH1 under-expression).
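
For readers unfamiliar with the concordance index reported here, the following is a minimal sketch of Harrell's C-index in plain Python. It is not the authors' code: the function name `concordance_index` and the toy data are illustrative assumptions, and pairs with tied survival times are simply skipped rather than handled with a full tie correction.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Fraction of comparable patient pairs whose predicted risks are
    ordered consistently with their observed survival times.

    times       : observed follow-up times
    events      : 1 if the event (death) was observed, 0 if censored
    risk_scores : model output, higher means worse predicted prognosis
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let `a` be the patient with the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the shorter time is an observed
        # event; tied times are skipped in this simplified sketch.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0      # earlier death had the higher risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5      # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort: four patients, the third one censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.6, 0.7, 0.1]
print(concordance_index(toy_times, toy_events, toy_risks))  # 0.8
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so a score above 0.8 means that more than four in five comparable patient pairs are ranked correctly.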
