SpaRC: Scalable Sequence Clustering using Apache Spark
Xiandong Meng | Zhong Wang | Michael Mascagni | Lizhen Shi | Elizabeth Tseng
[1] Philip D. Blood,et al. Critical Assessment of Metagenome Interpretation—a benchmark of metagenomics software , 2017, Nature Methods.
[2] Reynold Xin,et al. GraphX: a resilient distributed graph system on Spark , 2013, GRADES.
[3] Sebastian Deorowicz,et al. KMC 2: Fast and resource-frugal k-mer counting , 2014, Bioinform..
[4] Zhong Wang,et al. Next-generation transcriptome assembly , 2011, Nature Reviews Genetics.
[5] Carl Kingsford,et al. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers , 2011, Bioinform..
[6] Kunihiko Sadakane,et al. MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph , 2014, Bioinform..
[7] Xiandong Meng,et al. A case study of tuning MapReduce for efficient Bioinformatics in the cloud , 2017, Parallel Comput..
[8] Luis Pedro Coelho,et al. Structure and function of the global ocean microbiome , 2015, Science.
[9] Aart J. C. Bik,et al. Pregel: a system for large-scale graph processing , 2010, SIGMOD Conference.
[10] Max Klein,et al. Biospark: scalable analysis of large numerical datasets from biological simulations and experiments using Hadoop and Spark , 2017, Bioinform..
[11] Huzefa Rangwala,et al. A Map-Reduce Framework for Clustering Metagenomes , 2013, 2013 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum.
[12] David A. Patterson,et al. ADAM: Genomics Formats and Processing Patterns for Cloud Scale Computing , 2013 .
[13] Katherine H. Huang,et al. Detection of low-abundance bacterial strains in metagenomic datasets by eigengenome partitioning , 2015, Nature Biotechnology.
[14] Reynold Xin,et al. GraphFrames: an integrated API for mixing graph and relational queries , 2016, GRADES '16.
[15] Siu-Ming Yiu,et al. MetaCluster 5.0: a two-round binning approach for metagenomic data for low-abundance species in a noisy sample , 2012, Bioinform..
[16] Leonid Oliker,et al. HipMer: an extreme-scale de novo genome assembler , 2015, SC15: International Conference for High Performance Computing, Networking, Storage and Analysis.
[17] Xiandong Meng,et al. A near complete snapshot of the Zea mays seedling transcriptome revealed from ultra-deep sequencing , 2014, Scientific Reports.
[18] J. Hughes,et al. Counting the Uncountable: Statistical Approaches to Estimating Microbial Diversity , 2001, Applied and Environmental Microbiology.
[19] Ümit V. Çatalyürek,et al. Spaler: Spark and GraphX based de novo genome assembler , 2015, 2015 IEEE International Conference on Big Data (Big Data).
[20] P. Pevzner,et al. metaSPAdes: a new versatile metagenomic assembler. , 2017, Genome research.
[21] Frank Mueller,et al. SparkScore: Leveraging Apache Spark for Distributed Genomic Inference , 2016, 2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW).
[22] Heng Li,et al. Minimap and miniasm: fast mapping and de novo assembly for noisy long sequences , 2015, Bioinform..
[23] Jianxin Wang,et al. DIME: A Novel Framework for De Novo Metagenomic Sequence Assembly , 2015 .
[24] Jan-Fang Cheng,et al. Next generation sequencing data of a defined microbial mock community , 2016, Scientific Data.
[25] S. Tringe,et al. Metagenomic Discovery of Biomass-Degrading Genes and Genomes from Cow Rumen , 2011, Science.
[26] Réka Albert,et al. Near linear time algorithm to detect community structures in large-scale networks. , 2007, Physical review. E, Statistical, nonlinear, and soft matter physics.
[27] Stephen C. J. Parker,et al. Accurate and comprehensive sequencing of personal genomes. , 2011, Genome research.
[28] Joseph K. Bradley,et al. Spark SQL: Relational Data Processing in Spark , 2015, SIGMOD Conference.
[29] Ralph Roskies,et al. Bridges: a uniquely flexible HPC resource for new communities and data analytics , 2015, XSEDE.
[30] Veli Mäkinen,et al. A framework for space-efficient read clustering in metagenomic samples , 2017, BMC Bioinformatics.
[31] S. Tringe,et al. Tackling soil diversity with the assembly of large, complex metagenomes , 2014, Proceedings of the National Academy of Sciences.
[32] Xiandong Meng,et al. Widespread Polycistronic Transcripts in Fungi Revealed by Single-Molecule mRNA Sequencing , 2015, PloS one.
[33] Axel Visel,et al. Methane yield phenotypes linked to differential gene expression in the sheep rumen microbiome , 2014 .
[34] S. Koren,et al. Assembly algorithms for next-generation sequencing data. , 2010, Genomics.
[35] Alberto M. R. Dávila,et al. SparkBLAST: scalable BLAST processing using in-memory operations , 2017, BMC Bioinformatics.
[36] Michael J. Franklin,et al. Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing , 2012, NSDI.
[37] Yi Pan,et al. DIME: A Novel Framework for De Novo Metagenomic Sequence Assembly , 2015, J. Comput. Biol..
[38] Dominique Lavenier,et al. DSK: k-mer counting with very low memory usage , 2013, Bioinform..
[39] Edward M. Rubin,et al. Metagenomics: DNA sequencing of environmental samples , 2005, Nature Reviews Genetics.
[40] Xingjian Xu,et al. CloudPhylo: a fast and scalable tool for phylogeny reconstruction. , 2016, Bioinformatics.