Knime4Bio: a set of custom nodes for the interpretation of next-generation sequencing data with KNIME

Summary: Analysing the large amounts of data generated by next-generation sequencing (NGS) technologies is difficult for researchers and clinicians without computational skills. They are often compelled to delegate this task to computational biologists working with command-line utilities. As NGS becomes routine in research and diagnosis, easy-to-use tools will become essential, enabling investigators to handle much more of the analysis themselves. Here, we describe Knime4Bio, a set of custom nodes for the KNIME (The Konstanz Information Miner) interactive graphical workbench, for the interpretation of large biological datasets. We demonstrate that this tool can be used to quickly retrieve previously published scientific findings.

Availability: http://code.google.com/p/knime4bio/

Contact: richard.redon@univ-nantes.fr
