GENE-IS: Time-Efficient and Accurate Analysis of Viral Integration Events in Large-Scale Gene Therapy Data

Integration site profiling and clonality analysis of viral vector distribution in gene therapy are key to monitoring the fate of gene-corrected cells, assessing the risk of malignant transformation, and establishing vector biosafety. We developed the Genome Integration Site Analysis Pipeline (GENE-IS) for highly time-efficient and accurate detection of viral vector integration sites (ISs) in next-generation sequencing (NGS) gene therapy data. It is the first available tool with a dual analysis mode, allowing IS analysis both in data generated by PCR-based methods, such as linear amplification-mediated PCR (LAM-PCR), and in data from rapidly evolving targeted sequencing technologies (e.g., Agilent SureSelect). GENE-IS uses trimming strategies, a customized reference genome, and soft-clipped read information, combined with sequential filtering steps, to provide annotated ISs with clonality information. It is a scalable, robust, precise, and reliable tool for large-scale pre-clinical and clinical data analysis that gives users flexibility and control over the analysis through a broad range of configurable parameters. GENE-IS is available at https://github.com/G100DKFZ/gene-is.
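
To illustrate what using soft-clipped information can mean in practice, the following minimal sketch derives a candidate genome-side junction coordinate from a SAM alignment's POS and CIGAR fields. It is not taken from the GENE-IS code base: the function name `soft_clip_junction` and the `min_clip` threshold are hypothetical, and the sketch does not check that the clipped bases actually match vector sequence, which a real pipeline would need to do.

```python
import re
from typing import Optional, Tuple

# CIGAR operations as defined in the SAM format.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that advance the position on the reference.
REF_CONSUMING = set("MDN=X")


def soft_clip_junction(
    pos: int, cigar: str, min_clip: int = 20
) -> Optional[Tuple[int, str]]:
    """Return (junction coordinate, clipped side) or None.

    pos      -- SAM POS: 1-based position of the first aligned base
    cigar    -- SAM CIGAR string, e.g. "30S70M"
    min_clip -- hypothetical threshold: shortest soft clip accepted
    """
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Hard clips carry no sequence, so ignore them at the read ends.
    core = [o for o in ops if o[1] != "H"]
    if not core:
        return None

    left = core[0][0] if core[0][1] == "S" else 0
    right = core[-1][0] if len(core) > 1 and core[-1][1] == "S" else 0
    ref_len = sum(n for n, op in core if op in REF_CONSUMING)
    if ref_len == 0:
        return None  # nothing aligned to the genome

    # Assumption: the longer qualifying clip marks the junction.
    if left >= min_clip and left >= right:
        return pos, "left"                 # first aligned genomic base
    if right >= min_clip:
        return pos + ref_len - 1, "right"  # last aligned genomic base
    return None
```

For example, a read aligned at position 1000 with CIGAR `30S70M` would yield a left-side junction at 1000, while a fully aligned read (`100M`) yields no candidate.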
