Statistical Inference From Stem Cell Barcoding Data Using Adaptive Approximate Bayesian Computation

Background: Barcodes, supplied to cells by transduction with a library of unique DNA sequences, allow heterogeneity in cell populations to be identified and cell lineages to be traced. Estimating the number of hematopoietic stem cell (HSC) clones is important because it also yields an approximation of the number of hematopoietic stem cells from which the circulating blood cells descend. This problem is similar to the species problem, well known to ecologists. However, an additional "degree of freedom" exists, since different HSC generally give rise to clones with different growth rates. This lends credibility to sampling models based on different versions of the Dirichlet-multinomial distribution. Results: We developed a truncated population approximate Bayesian computation (ABC) algorithm, derived from sequential Monte Carlo ABC (SMC-ABC), and applied it to the symmetric Dirichlet-multinomial model proposed by Zhang et al. (2005) and to an asymmetric Dirichlet-multinomial model that we propose. The methodology was tested using simulated and real-life data. Conclusions: The results suggest that the flexibility of the asymmetric Dirichlet-multinomial model helps to obtain insight into the heterogeneity of proliferating cell systems such as HSC. Estimates based on experimental data approach the correct count of murine HSC.
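
As a rough illustration of the setting described above, the following Python sketch simulates a symmetric Dirichlet-multinomial sample of barcoded cells and estimates the number of clones with plain ABC rejection. It is not the truncated population ABC algorithm developed in this work, only the simplest member of the ABC family. The function names, the uniform prior on the clone number, the fixed concentration parameter `alpha`, the tolerance, and the choice of the number of distinct barcodes as the summary statistic are all assumptions made for the example.

```python
import numpy as np

def simulate_distinct_clones(n_clones, alpha, n_sampled, rng):
    """Draw one symmetric Dirichlet-multinomial sample and return
    the number of distinct clones that appear in it."""
    # Clone proportions: symmetric Dirichlet with concentration alpha (assumed).
    proportions = rng.dirichlet(np.full(n_clones, alpha))
    # Barcode read-out: n_sampled cells drawn according to those proportions.
    counts = rng.multinomial(n_sampled, proportions)
    return int(np.count_nonzero(counts))

def abc_rejection(observed_distinct, n_sampled, alpha, n_max, n_draws, tol, rng):
    """Plain ABC rejection: keep prior draws of the clone number whose
    simulated summary lies within `tol` of the observed summary."""
    accepted = []
    for _ in range(n_draws):
        # Uniform prior on 1..n_max (an assumption for this sketch).
        n_clones = int(rng.integers(1, n_max + 1))
        simulated = simulate_distinct_clones(n_clones, alpha, n_sampled, rng)
        if abs(simulated - observed_distinct) <= tol:
            accepted.append(n_clones)
    return np.array(accepted)

# Synthetic "observation" generated with a known clone number.
rng = np.random.default_rng(0)
true_n = 50
observed = simulate_distinct_clones(true_n, alpha=1.0, n_sampled=200, rng=rng)

# Approximate posterior sample of the clone number.
posterior = abc_rejection(observed, n_sampled=200, alpha=1.0,
                          n_max=200, n_draws=5000, tol=1, rng=rng)
print("accepted draws:", posterior.size)
print("posterior mean of clone number:", posterior.mean())
```

Because a sample can never contain more distinct barcodes than there are clones, every accepted clone number is at least the observed count minus the tolerance. This is the same lower-bound structure that appears in the species problem. Sequential schemes such as SMC-ABC replace the single fixed tolerance used here with a decreasing sequence of tolerances.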