PRISM: recovering cell-type-specific expression profiles from individual composite RNA-seq samples

Abstract

Motivation: A major challenge in analyzing cancer patient transcriptomes is that tumors are inherently heterogeneous and evolving. We analyzed 214 bulk RNA samples from a longitudinal, prospective ovarian cancer cohort and found that sample composition changes systematically with chemotherapy and across anatomical sites, which prevents direct comparison of treatment-naive and treated samples.

Results: To overcome this, we developed PRISM, a latent statistical framework that simultaneously extracts the sample composition and cell-type-specific whole-transcriptome profiles adapted to each individual sample. Our results indicate that the PRISM-derived composition-free transcriptomic profiles, and signatures derived from them, predict patient response better than the composite raw bulk data. We validated our findings in independent ovarian cancer and melanoma cohorts, and verified through whole-genome sequencing and RNA in situ hybridization experiments that PRISM accurately estimates the composition and cell-type-specific expression.

Availability and implementation: https://bitbucket.org/anthakki/prism

Supplementary information: Supplementary data are available at Bioinformatics online.
