FMFilter: A fast model based variant filtering tool

The availability of whole exome and genome sequencing has fundamentally changed how genetic disease studies are conducted. Disease-causing mechanisms can now be identified in less time and at lower cost. As a result, extracting the valuable information from the huge amount of data produced by next generation sequencing techniques has become a challenging task. Current tools analyze sequencing data in various ways; however, there is still a need for fast, easy-to-use, and effective tools. For genetic disease studies in particular, publicly available tools that support compound heterozygous and de novo models are lacking. In addition, existing tools either require advanced IT expertise or handle large variant files inefficiently. In this work, we present FMFilter, an efficient filtering tool for next generation sequencing data produced by genetic disease studies. The software lets the user choose the inheritance model (recessive, dominant, compound heterozygous, or de novo) and specify the affected and control individuals. It provides a user-friendly graphical user interface, which removes the need for advanced computing skills, and offers various filtering options that eliminate the majority of false positives. FMFilter requires negligible memory, so it can easily handle very large variant files, such as multiple whole genomes, on ordinary computers. We demonstrate the variant reduction capability and effectiveness of the proposed tool on public and in-house data for different inheritance models, and we compare FMFilter with existing filtering software. We conclude that FMFilter provides an effective and easy-to-use environment for analyzing next generation sequencing data from Mendelian diseases.
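The abstract does not describe how FMFilter is implemented, so the following is only a minimal sketch of what model-based filtering can look like, not the tool's actual code. It assumes biallelic, unphased genotypes written as `0/0`, `0/1`, and `1/1`, a hypothetical record layout (`gene`, `pos`, `gt`), and deliberately simplified rules for each inheritance model. Records are processed lazily with generators, which is one way a filter could keep memory use low on large files; the compound heterozygous step additionally assumes that all variants of a gene appear consecutively in the input.

```python
from itertools import combinations, groupby
from typing import Any, Dict, Iterable, Iterator, List, Tuple

# Hypothetical record layout: {"gene": str, "pos": int, "gt": {sample: genotype}}
Record = Dict[str, Any]
HOM_REF, HET, HOM_ALT = "0/0", "0/1", "1/1"


def carries(rec: Record, sample: str) -> bool:
    """True if the sample has at least one alternate allele at this variant."""
    return rec["gt"][sample] in (HET, HOM_ALT)


def passes_model(rec: Record, affected: List[str], controls: List[str], model: str) -> bool:
    """Simplified single-variant rules (assumptions, not FMFilter's rules)."""
    gt = rec["gt"]
    if model == "recessive":
        # Affected are homozygous alternate; no control is.
        return all(gt[s] == HOM_ALT for s in affected) and all(gt[c] != HOM_ALT for c in controls)
    if model == "dominant":
        # Affected carry the allele; controls do not.
        return all(carries(rec, s) for s in affected) and all(gt[c] == HOM_REF for c in controls)
    if model == "de_novo":
        # Affected are heterozygous; controls (e.g. parents) are homozygous reference.
        return all(gt[s] == HET for s in affected) and all(gt[c] == HOM_REF for c in controls)
    raise ValueError(f"unknown model: {model}")


def filter_variants(records: Iterable[Record], affected: List[str],
                    controls: List[str], model: str) -> Iterator[Record]:
    """Stream records one at a time; nothing is held in memory."""
    return (r for r in records if passes_model(r, affected, controls, model))


def compound_het(records: Iterable[Record], affected: List[str],
                 controls: List[str]) -> Iterator[Tuple[Record, Record]]:
    """Yield variant pairs within a gene that fit a compound heterozygous pattern.

    Only one gene's candidates are buffered at a time.
    """
    for _, group in groupby(records, key=lambda r: r["gene"]):
        candidates = [
            r for r in group
            if all(r["gt"][s] == HET for s in affected)
            and all(r["gt"][c] != HOM_ALT for c in controls)
        ]
        for a, b in combinations(candidates, 2):
            # Reject the pair if any unaffected control carries both variants.
            if not any(carries(a, c) and carries(b, c) for c in controls):
                yield a, b
```

For a trio, one would call `filter_variants(records, ["child"], ["mother", "father"], "recessive")`, where the sample names are placeholders; in practice the records would come from a parser reading the variant file line by line.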
