Leveraging sequences missing from the human genome to diagnose cancer

Cancer diagnosis using cell-free DNA (cfDNA) can significantly improve treatment and survival but has several technical limitations. Here, we show that tumor-associated mutations create neomers, DNA sequences 11-18 bp in length that are absent from the human genome, and that these neomers can accurately detect cancer subtypes and features. Using a neomer-based classifier, we detect twenty-one tumor types with higher accuracy than state-of-the-art methods. Refining this classifier via supervised learning identified additional cancer features with even greater precision. We also demonstrate that neomers can precisely diagnose cancer from cfDNA in liquid biopsy samples. Finally, we show that neomers can be used to detect cancer-associated non-coding mutations that affect gene regulatory activity. Combined, our results identify a novel, sensitive, specific and straightforward cancer diagnostic tool.
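
To make the neomer definition concrete, the following is a minimal sketch, not the pipeline used in this work. It assumes a single-base substitution and a short toy string standing in for the reference genome, and it uses k = 4 rather than the 11-18 bp range so the output stays readable. The function names `kmers` and `candidate_neomers` are hypothetical, introduced only for illustration.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def candidate_neomers(reference, position, alt_base, k):
    """Return k-mers that span a substitution and are absent from the reference.

    Assumption: `reference` is a toy stand-in for the genome. A real analysis
    would query a genome-wide k-mer index and also consider the reverse
    complement strand, neither of which is done here.
    """
    # Apply the single-base substitution at the 0-based position.
    mutated = reference[:position] + alt_base + reference[position + 1:]
    # Only windows that overlap the mutated base can contain new k-mers.
    start = max(0, position - k + 1)
    end = min(len(mutated), position + k)
    spanning = kmers(mutated[start:end], k)
    # Keep only the k-mers that never occur in the unmutated reference.
    return sorted(spanning - kmers(reference, k))


if __name__ == "__main__":
    toy_reference = "ACGTACGTTAGC"
    # Substitute the A at index 4 with G and list the resulting 4-mers
    # that do not occur anywhere in the toy reference.
    print(candidate_neomers(toy_reference, position=4, alt_base="G", k=4))
```

In this toy setting every 4-mer overlapping the substituted base is new to the reference. Against a real genome, a given mutation yields neomers only when the resulting k-mers are absent genome-wide, which is why k-mer length matters.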
