Model-based tumor subclonal reconstruction

The vast majority of cancer next-generation sequencing data consist of bulk samples composed of mixtures of cancer and normal cells. To study tumor evolution, subclonal reconstruction approaches based on machine learning are used to separate subpopulations of cancer cells and reconstruct their ancestral relationships. However, current approaches are entirely data-driven and agnostic to evolutionary theory. We demonstrate that systematic errors occur in subclonal reconstruction if tumor evolution is not accounted for, and that those errors increase when multiple samples are taken from the same tumor. To address this issue, we present a novel approach for model-based subclonal reconstruction that combines data-driven machine learning with evolutionary theory. Using public, synthetic and newly generated data, we show that the method is more robust and accurate than current techniques on both single-sample and multi-region sequencing data. With careful data curation and interpretation, we show how the method minimizes the confounding factors that affect non-evolutionary methods, leading to a more accurate recovery of the evolutionary history of human tumors.
