Computational analysis of cancer Next-Generation Sequencing data

Motivation: Sequencing of matched tumor and normal samples is the standard study design for reliable detection of somatic alterations. However, even very low levels of cross-sample contamination significantly impact the calling of somatic mutations, because contaminant germline variants can be incorrectly interpreted as somatic. There are currently no sequence-only methods that reliably estimate contamination levels in tumor samples, which frequently display copy number changes. As a solution, we developed Conpair, a tool for detection of sample swaps and cross-individual contamination in whole-genome and whole-exome tumor–normal sequencing experiments.

Results: On a ladder of in silico contaminated samples, we demonstrated that Conpair reliably measures contamination levels as low as 0.1%, even in the presence of copy number changes. We also estimated contamination levels in glioblastoma whole-genome (WGS) and whole-exome (WXS) tumor–normal datasets from TCGA and showed that they strongly correlate with tumor–normal concordance, as well as with the number of germline variants called as somatic by several widely used somatic callers.

Availability and Implementation: The method is available at https://github.com/nygenome/conpair.

Contact: egrabowska@gmail.com or mczody@nygenome.org

Supplementary information: Supplementary data are available at Bioinformatics online.
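To make the idea of estimating contamination from sequence data alone more concrete, the following is a minimal sketch and not Conpair's actual implementation. It assumes a simplified model: only sites where the sample itself is homozygous reference are used, the contaminant genotype at each site is unknown and follows Hardy-Weinberg proportions for a given population allele frequency, and copy number is ignored. The function names, the error rate `ERR`, and the simulated input are all hypothetical choices made for illustration.

```python
import math
import random

ERR = 0.001  # assumed per-base sequencing error rate (illustrative value)


def alt_read_prob(c, g, err=ERR):
    """Probability that a read shows the alt allele when the sample is
    homozygous reference and a contaminant carrying g alt copies (0, 1, 2)
    contributes a fraction c of the reads."""
    true_alt = c * g / 2.0
    return true_alt * (1 - err) + (1 - true_alt) * err


def hwe_priors(pop_af):
    """Hardy-Weinberg genotype priors for 0, 1 and 2 alt copies."""
    return ((1 - pop_af) ** 2, 2 * pop_af * (1 - pop_af), pop_af ** 2)


def site_log_likelihood(alt, depth, c, pop_af):
    """Log-likelihood of one site, marginalizing over the unknown
    contaminant genotype. The binomial coefficient is omitted because
    it does not depend on c."""
    total = 0.0
    for g, prior in enumerate(hwe_priors(pop_af)):
        p = alt_read_prob(c, g)
        total += prior * p ** alt * (1 - p) ** (depth - alt)
    return math.log(total)


def estimate_contamination(sites, grid=None):
    """Grid-search maximum-likelihood estimate of the contamination level.

    sites: iterable of (alt_reads, depth, population_allele_frequency).
    grid:  candidate contamination levels; defaults to 0%..10% in 0.1% steps.
    """
    if grid is None:
        grid = [i / 1000.0 for i in range(101)]
    sites = list(sites)
    return max(
        grid,
        key=lambda c: sum(site_log_likelihood(a, d, c, f) for a, d, f in sites),
    )


def simulate_sites(n, c, depth=60, pop_af=0.5, seed=0):
    """Simulate n homozygous-reference sites contaminated at level c."""
    rng = random.Random(seed)
    sites = []
    for _ in range(n):
        g = rng.choices((0, 1, 2), weights=hwe_priors(pop_af))[0]
        p = alt_read_prob(c, g)
        alt = sum(rng.random() < p for _ in range(depth))
        sites.append((alt, depth, pop_af))
    return sites


if __name__ == "__main__":
    simulated = simulate_sites(2000, c=0.02, seed=1)
    print("estimated contamination:", estimate_contamination(simulated))
```

Restricting the sketch to sites that are homozygous in the sample is a deliberate simplification: at such sites the expected allele fraction does not depend on how many copies of the locus the tumor carries, which is one plausible way to stay robust to copy number changes. A real tool would also need marker selection, base-quality handling, and a finer or adaptive search over contamination levels.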
