GenomePeek—an online tool for prokaryotic genome and metagenome analysis

As the volume of prokaryotic sequencing grows, a method is needed to analyze these data quickly and accurately. Previous tools are designed mainly for metagenomic analysis and have limitations such as long runtimes and significant false-positive rates. The online tool GenomePeek (edwards.sdsu.edu/GenomePeek) was developed to analyze both single-genome and metagenome sequencing files quickly and with low error rates. GenomePeek uses a sequence assembly approach in which reads matching a set of conserved genes are extracted, assembled, and then aligned against a highly specific reference database. GenomePeek was found to be faster than traditional approaches while keeping error rates low, and it offers unique data visualization options.
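The extract, assemble, align sequence described above can be illustrated with a small toy sketch. This is not GenomePeek's implementation: the function names (`extract_reads`, `greedy_assemble`, `classify`, `peek`), the k-mer matching, the greedy overlap merge, and the example sequences are all assumptions made for illustration, standing in for the dedicated read mappers, assemblers, and aligners a real pipeline would rely on.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def extract_reads(reads, marker_genes, k=8):
    """Keep only reads sharing at least one k-mer with any conserved gene."""
    marker_kmers = set()
    for gene in marker_genes.values():
        marker_kmers |= kmers(gene, k)
    return [r for r in reads if kmers(r, k) & marker_kmers]


def overlap(a, b, min_len):
    """Length of the longest suffix of a that equals a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_overlap=5):
    """Repeatedly merge the pair of sequences with the longest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop duplicate reads, keep order
    while True:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_overlap)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_n:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)] + [merged]


def classify(contig, reference_db, k=8):
    """Name of the reference sharing the most k-mers with the contig
    (a crude stand-in for a real sequence alignment)."""
    contig_kmers = kmers(contig, k)
    scores = {name: len(contig_kmers & kmers(seq, k))
              for name, seq in reference_db.items()}
    return max(scores, key=scores.get)


def peek(reads, marker_genes, reference_db):
    """Extract marker-gene reads, assemble them, and classify each contig."""
    contigs = greedy_assemble(extract_reads(reads, marker_genes))
    return [(c, classify(c, reference_db)) for c in contigs]


# Hypothetical example data: two made-up marker sequences.
GENE_A = "ATGGCTAGCAAGGATCCGTTCGATTACGGCTA"
GENE_B = "ATGGCTAGGTTGGATACGTTAGATCACGGTTA"
REFERENCE_DB = {"species_A": GENE_A, "species_B": GENE_B}

# Three overlapping reads from GENE_A plus one unrelated read.
READS = [GENE_A[0:16], GENE_A[8:24], GENE_A[16:32], "TTTTTTTTTTTTTTTT"]

if __name__ == "__main__":
    for contig, name in peek(READS, REFERENCE_DB, REFERENCE_DB):
        print(name, contig)
```

Classifying assembled contigs rather than individual short reads is the design point this sketch tries to convey: a longer sequence shares more k-mers with its true source, so it is less likely to be assigned to a near neighbour in the reference set.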
