Bioinformatics for precision oncology

Abstract

Molecular profiling of tumor biopsies plays an increasingly important role not only in cancer research but also in the clinical management of cancer patients. Multi-omics approaches hold the promise of improving diagnostics, prognostics and personalized treatment. To deliver on this promise of precision oncology, appropriate bioinformatics methods for managing, integrating and analyzing large and complex data are necessary. Here, we discuss the specific requirements for bioinformatics methods and software that arise in clinical oncology, owing to a stricter regulatory environment and the need for rapid, highly reproducible and robust procedures. We describe the workflow of a molecular tumor board and the specific bioinformatics support that it requires, from the primary analysis of raw molecular profiling data to the automatic generation of a clinical report and its delivery to decision-making clinical oncologists. Such workflows have been implemented, to varying degrees, in many clinical trials as well as in molecular tumor boards at specialized cancer centers and university hospitals worldwide. We review these implementations and more recent efforts to bring other high-dimensional multi-omics patient profiles into the tumor board, as well as the state of clinical decision support software for translating molecular findings into treatment recommendations.
