eTumorType, An Algorithm of Discriminating Cancer Types for Circulating Tumor Cells or Cell-free DNAs in Blood

Advances in detecting circulating tumor cells (CTCs) and cell-free DNAs (cfDNAs) in blood, serum, and plasma have made non-invasive diagnosis of cancer promising. A few studies have reported good correlations between signals from tumor tissues and those from CTCs or cfDNAs, making it possible to detect cancers using CTCs and cfDNAs. However, such detection alone cannot tell which cancer type a person has. To address this limitation, we developed an algorithm, eTumorType, to identify cancer types based on copy number variations (CNVs) of the cancer founding clone. eTumorType integrates cancer hallmark concepts with several computational techniques, including stochastic gradient boosting, voting, centroid, and leading patterns. eTumorType was trained and validated on a large dataset comprising 18 common cancer types and 5327 tumor samples. It achieved high accuracies (0.86–0.96) and high recall rates (0.79–0.92) for predicting colon, brain, prostate, and kidney cancers. Relatively high accuracies (0.78–0.92) and recall rates (0.58–0.95) were also achieved for predicting ovarian, breast luminal, lung, endometrial, stomach, head and neck, leukemia, and skin cancers. These results suggest that eTumorType could be used for non-invasive diagnosis to determine cancer types based on CNVs of CTCs and cfDNAs.
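
To make two of the techniques named in the abstract concrete, the following is a minimal sketch of centroid-based classification combined with majority voting. It is not the eTumorType implementation: the abstract does not specify how centroids and votes are computed, and the stochastic gradient boosting and leading-pattern steps are omitted here. The function names, the toy CNV values, and the `feature_sets` (hypothetical stand-ins for hallmark-based gene groups) are all assumptions made for illustration.

```python
from collections import Counter
from math import dist


def centroids(samples, labels):
    """Return the mean CNV profile (centroid) for each cancer type."""
    sums, counts = {}, Counter(labels)
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def nearest_centroid(x, cents):
    """Pick the cancer type whose centroid is closest in Euclidean distance."""
    return min(cents, key=lambda y: dist(x, cents[y]))


def vote(x, samples, labels, feature_sets):
    """Classify x once per feature set, then return the majority label."""
    ballots = []
    for feats in feature_sets:
        sub = [[s[i] for i in feats] for s in samples]
        cents = centroids(sub, labels)
        ballots.append(nearest_centroid([x[i] for i in feats], cents))
    return Counter(ballots).most_common(1)[0][0]


# Toy data (assumed): rows are samples, columns are per-gene CNV values.
samples = [
    [1.0, 0.9, -0.1],
    [0.8, 1.1, 0.0],
    [-1.0, 0.1, 0.9],
    [-0.9, -0.1, 1.1],
]
labels = ["colon", "colon", "kidney", "kidney"]

# Hypothetical feature sets: each is a list of gene (column) indices.
feature_sets = [[0, 1], [1, 2], [0, 2]]

prediction = vote([0.9, 1.0, 0.1], samples, labels, feature_sets)
print(prediction)
```

Each feature set casts one vote, so a single noisy gene group cannot decide the outcome alone. In practice the per-set classifiers could be any model, including boosted trees.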
