Breast cancer histopathology image analysis pipeline for tumor purity estimation

The translation of genomic sequencing technology to the clinic has greatly advanced personalized medicine. However, the presence of normal cells in tumors is a confounding factor in genome sequence analysis. Tumor purity, the percentage of cancerous cells in a whole tissue section, is a correction factor that can be used to improve the clinical utility of genomic sequencing. Currently, tumor purity is estimated visually by expert pathologists; however, such scores have been shown to vary widely between observers. In this paper, we propose a quantitative image analysis pipeline for tumor purity estimation and provide a systematic comparison between pathologists' scores and our image-based estimates.
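As an illustration of the definition above, tumor purity can be computed as a simple cell-count ratio once each detected cell has a class label. The following is a minimal sketch, not the pipeline proposed in this paper: the function name `tumor_purity`, the label strings, and the example counts are all hypothetical, and it assumes that a per-cell classification is already available.

```python
from collections import Counter


def tumor_purity(cell_labels, cancer_label="cancer"):
    """Return the percentage of cells labeled as cancerous.

    cell_labels: iterable of per-cell class labels (hypothetical names).
    cancer_label: the label that marks a cancerous cell (an assumption).
    """
    counts = Counter(cell_labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no cells to score")
    return 100.0 * counts[cancer_label] / total


# Made-up example: 60 cancer cells, 30 stromal cells, 10 lymphocytes.
labels = ["cancer"] * 60 + ["stromal"] * 30 + ["lymphocyte"] * 10
print(tumor_purity(labels))  # 60.0
```

A count-based ratio treats every cell equally; an area-weighted definition would give different values and is an equally plausible reading of "percentage of cancerous cells".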
