Long-read sequencing of diagnosis and post-therapy medulloblastoma reveals complex rearrangement patterns and epigenetic signatures

Cancer genomes harbor a broad spectrum of structural variants (SVs) that drive tumorigenesis, a relevant subset of which is likely to escape discovery in short reads. We employed Oxford Nanopore Technologies (ONT) sequencing in a paired diagnostic and post-therapy medulloblastoma to unravel the haplotype-resolved somatic genetic and epigenetic landscape. We assemble complex rearrangements and rearrangements associated with telomeric sequences, including a 1.55 Mbp chromothripsis event. We uncover a complex SV pattern termed ‘templated insertion thread’, characterized by short (mostly <1 kb) insertions that show prevalent self-concatenation into highly amplified structures of up to 50 kbp in size. Templated insertion threads occur in 3% of cancers, with a prevalence of up to 74% in liposarcoma, and frequently colocalize with chromothripsis. We also perform long-read-based methylome profiling and discover allele-specific methylation (ASM) effects, complex rearrangements exhibiting differential methylation, and differential promoter methylation in seven cancer-driver genes. Our study shows the potential of long-read sequencing in cancer.

Graphical abstract

I) We investigate a single patient with chromothriptic sonic hedgehog medulloblastoma (Li-Fraumeni syndrome), with tissue samples taken from blood, the primary tumor at diagnosis, and a post-treatment (relapse) tumor.

II) Data on the three samples were collected from four sources: 1) Illumina whole-genome sequencing, 2) Illumina transcriptome sequencing, 3) Illumina Infinium HumanMethylation450k arrays, and 4) long-read whole-genome sequencing using ONT.

III) An integrative analysis combines genomic, epigenomic, and transcriptomic data to provide a comprehensive analysis of this heavily rearranged tumor sample. Long- and short-read sequencing data are used to inform the analysis of complex structural genomic variants. Methylation called from haplotyped ONT reads and validated through the methylation array data allows for a haplotype-resolved study of genomic and epigenomic variation, which can then be examined for transcriptional effects.

IV) This integrative analysis allows us to identify a large number of inter- and intra-chromosomal genomic rearrangements (A), including a complex rearrangement pattern we term templated insertion threads (B), as well as sample-specific and haplotype-specific methylation patterns of known cancer genes (C).
