PipeIT: Singularity Container for Molecular Diagnostic Somatic Variant Calling on Ion Torrent NGS Platform.

The accurate identification of somatic mutations has become a pivotal component of tumor profiling and precision medicine. In molecular diagnostics laboratories, somatic mutation analyses of Ion Torrent sequencing data are typically performed on the Ion Reporter platform, which requires extensive manual review of the results and lacks optimized analysis workflows for custom targeted sequencing panels. Alternative solutions based on custom bioinformatics pipelines require the sequential execution of software tools with numerous parameters, leading to poor reproducibility and portability. Here we present PipeIT, a stand-alone Singularity container of a somatic mutation calling and filtering pipeline for matched tumor-normal Ion Torrent sequencing data. PipeIT identified the pathogenic variants in BRAF, KRAS, PIK3CA, CTNNB1, TP53, and other cancer genes that the clinical-grade Oncomine workflow identified. Additionally, PipeIT analysis of tumor-normal paired data generated on a custom targeted sequencing panel achieved 100% positive predictive value (PPV) and 99% sensitivity, compared with 68% to 80% PPV and 92% to 96% sensitivity for the default tumor-normal paired Ion Reporter workflow, substantially reducing the need for manual curation of the results. PipeIT can be rapidly deployed in any laboratory, ensures reproducible results, and can be executed with a single command and minimal input files from the user.
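The PPV and sensitivity figures quoted above follow the standard definitions: PPV is the fraction of called variants that are true, and sensitivity is the fraction of true variants that were called. The sketch below is illustrative only and is not part of PipeIT. It assumes variants are compared as exact (chromosome, position, ref, alt) tuples against a curated truth set, and the function name `evaluate_calls` and all coordinates are hypothetical placeholders.

```python
import math
from typing import Set, Tuple

# Hypothetical variant key: (chromosome, position, reference allele, alternate allele).
Variant = Tuple[str, int, str, str]


def evaluate_calls(called: Set[Variant], truth: Set[Variant]) -> Tuple[float, float]:
    """Return (PPV, sensitivity) of a call set against a truth set."""
    tp = len(called & truth)   # true positives: called and in the truth set
    fp = len(called - truth)   # false positives: called but not in the truth set
    fn = len(truth - called)   # false negatives: in the truth set but not called
    ppv = tp / (tp + fp) if called else math.nan
    sensitivity = tp / (tp + fn) if truth else math.nan
    return ppv, sensitivity


# Made-up example data, not taken from the study.
truth_set = {("chr1", 100, "A", "T"), ("chr2", 200, "G", "C"), ("chr3", 300, "C", "T")}
call_set = {("chr1", 100, "A", "T"), ("chr2", 200, "G", "C"), ("chr4", 400, "T", "G")}

ppv, sensitivity = evaluate_calls(call_set, truth_set)
print(f"PPV = {ppv:.2f}, sensitivity = {sensitivity:.2f}")
```

Under these definitions, a workflow with high sensitivity but low PPV finds most true mutations yet also reports many false positives, which is why lower PPV translates into more manual curation.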
