Correlation Analysis of Histopathology and Proteogenomics Data for Breast Cancer

Histopathology images are important for cancer diagnosis and prognosis. We extracted quantitative morphological features from breast cancer images and systematically analyzed their relationships with proteins and mRNAs. We observed concordant correlation patterns between image-protein and image-mRNA pairs and identified four cancer-related biological processes associated with morphological features of different tumor components. Further, we observed that proteomic data reveal unique protein-related biological processes associated with morphology. Finally, we identified prognostic morphological features whose roles are consistent with the underlying biological processes.

Highlights
Consistent correlation patterns between image-protein and image-mRNA at the genome level.
Four major biological processes associated with cellular and tissue morphology.
Proteomic data reveal protein-specific biological processes associated with morphology.
Morphological features can predict survival with relevant molecular events.

Tumors are heterogeneous tissues containing different types of cells, such as cancer cells, fibroblasts, and lymphocytes. Although the morphological features of tumors are critical for cancer diagnosis and prognosis, the molecular events and genes underlying tumor morphology are far from clear. With advances in computational pathology and the accumulation of large numbers of cancer samples with matched molecular and histopathology data, researchers can carry out integrative analyses to investigate this issue. In this study, we systematically examine the relationships between morphological features and various molecular data in breast cancers. Specifically, we identified 73 breast cancer patients from the TCGA and CPTAC projects with matched whole-slide images, RNA-seq, and proteomic data. By calculating 100 different morphological features and correlating them with the transcriptomic and proteomic data, we inferred four major biological processes associated with various interpretable morphological features. These processes are metabolism, cell cycle, immune response, and extracellular matrix development, all of which are hallmarks of cancer; the associated morphological features relate to the area, density, and shape of epithelial cells, fibroblasts, and lymphocytes. In addition, protein-specific biological processes were inferred solely from proteomic data, suggesting the importance of proteomic data in obtaining a holistic understanding of the molecular basis of tumor tissue morphology. Furthermore, survival analysis yielded specific morphological features related to patient prognosis, which are strongly associated with important molecular events in our analysis. Overall, our study demonstrates the power of integrating multiple types of biological data from cancer samples to generate new hypotheses and to identify potential biomarkers that predict patient outcome. Future work includes causal analysis to identify key regulators of cancer tissue development and validation of the findings on additional independent data sets.
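The correlation step described above can be illustrated with a minimal sketch. The abstract does not state which correlation measure was used, so the rank-based (Spearman) correlation here is an assumption, as are the function names, the gene count of 500, and the synthetic random data standing in for real morphological features and expression values. Only the sample count (73) and feature count (100) come from the text.

```python
import numpy as np

def rank_columns(x):
    """Rank the values in each column from 0 to n-1.

    Ties are broken by position, which is acceptable for continuous data
    where exact ties are not expected.
    """
    return np.argsort(np.argsort(x, axis=0), axis=0).astype(float)

def spearman_matrix(features, expression):
    """Return an (n_features x n_genes) matrix of Spearman correlations.

    Both inputs are (n_samples x n_variables) arrays whose rows follow
    the same sample order.
    """
    fr = rank_columns(features)
    er = rank_columns(expression)
    # Standardize the ranks so that the dot product below is a Pearson
    # correlation of ranks, which is the Spearman coefficient.
    fr = (fr - fr.mean(axis=0)) / fr.std(axis=0)
    er = (er - er.mean(axis=0)) / er.std(axis=0)
    return fr.T @ er / features.shape[0]

# Synthetic stand-in data (hypothetical): 73 samples, 100 image features,
# and 500 molecular variables (proteins or mRNAs).
rng = np.random.default_rng(0)
n_samples, n_features, n_genes = 73, 100, 500
features = rng.normal(size=(n_samples, n_features))
expression = rng.normal(size=(n_samples, n_genes))

# Plant one monotonic relationship so the sketch has a known strong signal.
expression[:, 0] = np.exp(features[:, 0])

corr = spearman_matrix(features, expression)
print(corr.shape)
print(round(corr[0, 0], 3))
```

Running the same function once on the proteomic matrix and once on the RNA-seq matrix would give two feature-by-gene correlation matrices that can be compared for concordance. In a real analysis, the resulting p-values would also need multiple-testing correction, which this sketch omits.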
