Ultrafast clustering algorithms for metagenomic sequence analysis
John C. Wooley | Sitao Wu | Beifang Niu | Weizhong Li | Limin Fu
[1] D. Davison,et al. d2_cluster: a validated method for clustering EST and full-length cDNA sequences. , 1999, Genome research.
[2] C. Quince,et al. Accurate determination of microbial diversity from 454 pyrosequencing data , 2009, Nature Methods.
[3] Gregory D. Schuler,et al. ESTablishing a human transcript map , 1995, Nature Genetics.
[4] R. Knight,et al. Bacterial Community Variation in Human Body Habitats Across Space and Time , 2009, Science.
[5] Russell J. Davenport,et al. Removing Noise From Pyrosequenced Amplicons , 2011, BMC Bioinformatics.
[6] Adam Godzik,et al. Tolerating some redundancy significantly speeds up clustering of large protein databases , 2002, Bioinform..
[7] John Quackenbush,et al. TIGR Gene Indices clustering tools (TGICL): a software system for fast clustering of large EST datasets , 2003, Bioinform..
[8] P. Pevzner,et al. Efficient de novo assembly of single-cell bacterial genomes from short-read data sets , 2011, Nature Biotechnology.
[9] E. Birney,et al. Pfam: the protein families database , 2013, Nucleic Acids Res..
[10] A. Halpern,et al. The Sorcerer II Global Ocean Sampling Expedition: Northwest Atlantic through Eastern Tropical Pacific , 2007, PLoS biology.
[11] William G. Mckendree,et al. ESPRIT: estimating species richness using large collections of 16S rRNA pyrosequences , 2009, Nucleic acids research.
[12] Hiroyuki Ogata,et al. KEGG: Kyoto Encyclopedia of Genes and Genomes , 1999, Nucleic Acids Res..
[13] Tao Jiang,et al. SEED: efficient clustering of next-generation sequences , 2011, Bioinform..
[14] Weizhong Li,et al. Analysis and comparison of very large metagenomes with fast clustering and functional annotation , 2009, BMC Bioinformatics.
[15] Anton J. Enright,et al. GeneRAGE: a robust algorithm for sequence clustering and domain detection , 2000, Bioinform..
[16] Martin Hartmann,et al. Introducing mothur: Open-Source, Platform-Independent, Community-Supported Software for Describing and Comparing Microbial Communities , 2009, Applied and Environmental Microbiology.
[17] Ying Gao,et al. CD-HIT Suite: a web server for clustering and comparing biological sequences , 2010, Bioinform..
[18] A. Godzik,et al. Probing Metagenomics by Rapid Cluster Analysis of Very Large Datasets , 2008, PloS one.
[19] Folker Meyer,et al. The Metagenomics RAST Server: A Public Resource for the Automatic Phylogenetic and Functional Analysis of Metagenomes , 2011 .
[20] Eoin L. Brodie,et al. Greengenes, a Chimera-Checked 16S rRNA Gene Database and Workbench Compatible with ARB , 2006, Applied and Environmental Microbiology.
[21] Sitao Wu,et al. WebMGA: a customizable web server for fast metagenomic sequence analysis , 2011, BMC Genomics.
[22] B. Roe,et al. A core gut microbiome in obese and lean twins , 2008, Nature.
[23] Alexander Schliep,et al. ProClust: improved clustering of protein sequences with an extended graph-based approach , 2002, ECCB.
[24] Robert C. Edgar,et al. BIOINFORMATICS APPLICATIONS NOTE , 2001 .
[25] William A. Walters,et al. QIIME allows analysis of high-throughput community sequencing data , 2010, Nature Methods.
[26] J. Handelsman,et al. Introducing DOTUR, a Computer Program for Defining Operational Taxonomic Units and Estimating Species Richness , 2005, Applied and Environmental Microbiology.
[27] A. Godzik,et al. Sequence clustering strategies improve remote homology recognitions while reducing search times. , 2002, Protein engineering.
[28] Adam Godzik,et al. Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences , 2006, Bioinform..
[29] P. Bork,et al. A human gut microbial gene catalogue established by metagenomic sequencing , 2010, Nature.
[30] Mihai Pop,et al. DNACLUST: accurate and efficient clustering of phylogenetic marker genes , 2011, BMC Bioinformatics.
[31] W. J. Kent,et al. BLAT--the BLAST-like alignment tool. , 2002, Genome research.
[32] Adam Godzik,et al. Clustering of highly homologous sequences to reduce the size of large protein databases , 2001, Bioinform..
[33] T. Takagi,et al. MetaGene: prokaryotic gene finding from environmental genome shotgun sequences , 2006, Nucleic acids research.
[34] S. Morishita,et al. Efficient frequency-based de novo short-read clustering for error trimming in next-generation sequencing. , 2009, Genome research.
[35] Benjamin J. Raphael,et al. The Sorcerer II Global Ocean Sampling Expedition: Expanding the Universe of Protein Families , 2007, PLoS biology.
[36] Sean R Eddy,et al. A new generation of homology search tools based on probabilistic inference. , 2009, Genome informatics. International Conference on Genome Informatics.
[37] Xiaoyu Wang,et al. A large-scale benchmark study of existing algorithms for taxonomy-independent microbial community analysis , 2012, Briefings Bioinform..
[38] Jing Chen,et al. Community cyberinfrastructure for Advanced Microbial Ecology Research and Analysis: the CAMERA resource , 2010, Nucleic Acids Res..
[39] Inge Jonassen,et al. Fast Sequence Clustering Using A Suffix Array Algorithm , 2003, Bioinform..
[40] Lu Wang,et al. The NIH Human Microbiome Project. , 2009, Genome research.
[41] Andreas Wilke,et al. phylogenetic and functional analysis of metagenomes , 2022 .
[42] Richard Durbin,et al. Fast and accurate short read alignment with Burrows-Wheeler transform , 2009 .
[43] Liisa Holm,et al. RSDB: representative protein sequence databases have high information content , 2000, Bioinform..
[44] Ori Sasson,et al. ProtoNet: hierarchical classification of the protein space , 2003, Nucleic Acids Res..
[45] John C. Wooley,et al. A Primer on Metagenomics , 2010, PLoS Comput. Biol..
[46] W. Ludwig,et al. SILVA: a comprehensive online resource for quality checked and aligned ribosomal RNA sequence data compatible with ARB , 2007, Nucleic acids research.
[47] Susan M. Huse,et al. Ironing out the wrinkles in the rare biosphere through improved OTU clustering , 2010, Environmental microbiology.
[48] Elaine R. Mardis,et al. A decade’s perspective on DNA sequencing technology , 2011, Nature.
[49] Limin Fu,et al. Artificial and natural duplicates in pyrosequencing reads of metagenomic data , 2010, BMC Bioinformatics.
[50] Shibu Yooseph,et al. Gene identification and protein classification in microbial metagenomic sequence data via incremental clustering , 2007, BMC Bioinformatics.
[51] Paul Medvedev,et al. Error correction of high-throughput sequencing datasets with non-uniform coverage , 2011, Bioinform..
[52] Zsuzsanna Lipták,et al. KABOOM! A new suffix array based algorithm for clustering expression data , 2011, Bioinform..
[53] R. Knight,et al. Rapid denoising of pyrosequencing amplicon data: exploiting the rank-abundance distribution , 2010, Nature Methods.
[54] William R. Taylor,et al. Association of nucleotide patterns with gene function classes: application to human 3' untranslated sequences , 2002, Bioinform..
[55] M. Pop,et al. Metagenomic Analysis of the Human Distal Gut Microbiome , 2006, Science.
[56] Winston Hide,et al. CLU: A new algorithm for EST clustering , 2005, BMC Bioinformatics.
[57] Huanming Yang,et al. De novo assembly of human genomes with massively parallel short read sequencing. , 2010, Genome research.
[58] Susumu Goto,et al. KEGG: Kyoto Encyclopedia of Genes and Genomes , 2000, Nucleic Acids Res..
[59] E. Birney,et al. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. , 2008, Genome research.
[60] Zhengwei Zhu,et al. FR-HIT, a very fast program to recruit metagenomic reads to homologous reference genomes , 2011, Bioinform..
[61] J. Handelsman. Metagenomics: Application of Genomics to Uncultured Microorganisms , 2004, Microbiology and Molecular Biology Reviews.
[62] Peter B. McGarvey,et al. UniRef: comprehensive and non-redundant UniProt reference clusters , 2007, Bioinform..
[63] A. Godzik,et al. Comparison of sequence profiles. Strategies for structural predictions using sequence information , 2008, Protein science : a publication of the Protein Society.
[64] Tracy K. Teal,et al. Systematic artifacts in metagenomes from complex microbial communities , 2009, The ISME Journal.
[65] Burkhard Rost,et al. UniqueProt: creating representative protein sequence sets , 2003, Nucleic Acids Res..
[66] S. Tringe,et al. Metagenomic Discovery of Biomass-Degrading Genes and Genomes from Cow Rumen , 2011, Science.
[67] Gayle M. Wittenberg,et al. EDAR: An Efficient Error Detection and Removal Algorithm for Next Generation Sequencing Data , 2010, J. Comput. Biol..
[68] S. Tringe,et al. Comparative Metagenomics of Microbial Communities , 2004, Science.
[69] C. Stoeckert,et al. OrthoMCL: identification of ortholog groups for eukaryotic genomes. , 2003, Genome research.
[70] V. Kunin,et al. Wrinkles in the rare biosphere: pyrosequencing errors can lead to artificial inflation of diversity estimates. , 2009, Environmental microbiology.
[71] James R. Cole,et al. The Ribosomal Database Project: improved alignments and new tools for rRNA analysis , 2008, Nucleic Acids Res..
[72] O. White,et al. Environmental Genome Shotgun Sequencing of the Sargasso Sea , 2004, Science.
[73] L. Holm,et al. The Pfam protein families database , 2005, Nucleic Acids Res..
[74] J. Gilbert,et al. Detection of Large Numbers of Novel Sequences in the Metatranscriptomes of Complex Marine Microbial Communities , 2008, PloS one.
[75] Rick L. Stevens,et al. Functional metagenomic profiling of nine biomes , 2008, Nature.
[76] Andrew H. Chan,et al. ECHO: a reference-free short-read error correction algorithm. , 2011, Genome research.
[77] Elon Portugaly,et al. Efficient algorithms for accurate hierarchical clustering of huge datasets: tackling the entire protein space , 2008, ISMB.
[78] Nathan Linial,et al. ProtoMap: automatic classification of protein sequences and hierarchy of protein families , 2000, Nucleic Acids Res..
[79] I-Min A. Chen,et al. IMG/M: a data management and analysis system for metagenomes , 2007, Nucleic Acids Res..
[80] Anton J. Enright,et al. An efficient algorithm for large-scale detection of protein families. , 2002, Nucleic acids research.
[81] Siu-Ming Yiu,et al. SOAP2: an improved ultrafast tool for short read alignment , 2009, Bioinform..
[82] Haixu Tang,et al. RAPSearch: a fast protein similarity search tool for short reads , 2011, BMC Bioinformatics.