Bioinformatics tools and databases for analysis of next-generation sequence data.

Next-generation technologies have revolutionized genome sequencing by rapidly producing vast quantities of data at relatively low cost. Because data production is no longer the limiting factor, the major challenge is now to analyse this flood of data and interpret its biological meaning. Bioinformaticians have risen to the challenge, producing a large number of software tools and databases that continue to evolve with this rapidly advancing field. Here, we outline some of the tools and databases commonly used for the analysis of next-generation sequence data and comment on their utility.
