A new approach (EDIZ) for big data variant prioritization

A whole exome sequencing (WES) workflow consists of the following steps: raw data quality assessment, pre-processing, alignment, post-processing, variant calling, annotation, and prioritization. WES of human samples has been reported to detect approximately 20,000–30,000 SNV and indel calls on average, so choosing the tool best suited to a given study is very important. In this study, we aimed to upgrade our previous in-house variant prioritization method to analyse WES data without using in silico methods. With this method, the annotated data were reduced 52.3-fold on average. We thereby established a WES workflow that reduces the data to be evaluated and is intended to increase the diagnostic rate of patients. We are also currently building a web-based workflow to make the method available to users worldwide.
