Engineered in-vitro cell line mixtures and robust evaluation of computational methods for clonal decomposition and longitudinal dynamics in cancer

Characterization and quantification of tumour clonal populations over time via longitudinal sampling are essential to understanding and predicting the response to therapeutic interventions. Computational methods for inferring tumour clonal composition from deep targeted sequencing data are ubiquitous; however, because ground-truth biological data are lacking, evaluating their performance is difficult. In this work, we generate a benchmark data set that simulates longitudinal tumour growth and heterogeneity by mixing cancer cell lines in vitro in known proportions. We apply four different algorithms to this ground-truth data set and assess their performance in inferring clonal composition using different metrics. We also analyse the performance of these algorithms on breast tumour xenograft samples. We conclude that methods that can analyse multiple samples simultaneously, while accounting for copy number alterations as a factor in allelic measurements, yield the most accurate predictions. These results will inform future functional-genomics-oriented studies of model systems, where time series measurements in the context of therapeutic interventions are becoming increasingly common. Such studies will need computational models that accurately reflect the multi-factorial nature of allele measurement in cancer, including, as we show here, segmental aneuploidies.
