Entropy-scaling search of massive biological data
Bonnie Berger | Noah M. Daniels | Yun William Yu | David Christian Danko
[1] C. Huttenhower, et al. Metagenomic microbial community profiling using unique clade-specific marker genes, 2012, Nature Methods.
[2] Divyakant Agrawal, et al. Vector approximation based indexing for non-uniform high dimensional data sets, 2000, CIKM '00.
[3] M. David, et al. Metagenomic analysis of a permafrost microbial community reveals a rapid response to thaw, 2011, Nature.
[4] J. Banfield, et al. Community structure and metabolism through reconstruction of microbial genomes from the environment, 2004, Nature.
[5] Rachel S. G. Sealfon, et al. Genomic surveillance elucidates Ebola virus origin and transmission during the 2014 outbreak, 2014, Science.
[6] Hans-Jörg Schek, et al. A Quantitative Analysis and Performance Study for Similarity-Search Methods in High-Dimensional Spaces, 1998, VLDB.
[7] E. Myers, et al. Basic local alignment search tool, 1990, Journal of molecular biology.
[8] Terence Tao. Product set estimates for non-commutative groups, 2008, Comb.
[9] Chao Xie, et al. Fast and sensitive protein alignment using DIAMOND, 2014, Nature Methods.
[10] Esko Ukkonen, et al. Algorithms for Approximate String Matching, 1985, Inf. Control.
[11] Yongan Zhao, et al. RAPSearch2: a fast and memory-efficient protein similarity search tool for next-generation sequencing data, 2011, Bioinform.
[12] Satu Elisa Schaeffer, et al. Graph Clustering, 2017, Encyclopedia of Machine Learning and Data Mining.
[13] N. Pace, et al. Gastrointestinal microbiology enters the metagenomics era, 2008, Current opinion in gastroenterology.
[14] Sergey Nepomnyachiy, et al. Global view of the protein universe, 2014, Proceedings of the National Academy of Sciences.
[15] Gregory D. Schuler, et al. Database resources of the National Center for Biotechnology Information: update, 2004, Nucleic acids research.
[16] Inbal Budowski-Tal, et al. FragBag, an accurate representation of protein structure, retrieves structural neighbors from the entire PDB quickly and accurately, 2010, Proceedings of the National Academy of Sciences.
[17] Giovanni Manzini, et al. Opportunistic data structures with applications, 2000, Proceedings 41st Annual Symposium on Foundations of Computer Science.
[18] Bonnie Berger, et al. Quality score compression improves genotyping accuracy, 2015, Nature Biotechnology.
[19] Mona Singh, et al. Computational solutions for omics data, 2013, Nature Reviews Genetics.
[20] Pavel Zezula, et al. M-tree: An Efficient Access Method for Similarity Search in Metric Spaces, 1997, VLDB.
[21] Thomas C. Conway, et al. Succinct data structures for assembling large genomes, 2010, Bioinform.
[22] S. B. Needleman, et al. A general method applicable to the search for similarities in the amino acid sequence of two proteins, 1970, Journal of molecular biology.
[23] R. Levy, et al. Simplified amino acid alphabets for protein fold recognition and implications for folding, 2000, Protein engineering.
[24] Rob Phillips, et al. Reduced amino acid alphabets exhibit an improved sensitivity and selectivity in fold assignment, 2009, Bioinform.
[25] Roberto Grossi, et al. Compressed Suffix Arrays and Suffix Trees with Applications to Text Indexing and String Matching, 2005, SIAM J. Comput.
[26] G. Kiczales, et al. Proceedings the, 1997.
[27] Guy Joseph Jacobson, et al. Succinct static data structures, 1988.
[28] N. Linial, et al. ProtoMap: Automatic classification of protein sequences, a hierarchy of protein families, and local maps of the protein space, 1999, Proteins.
[29] Nathan Linial, et al. Recovering key biological constituents through sparse representation of gene expression, 2011, Bioinform.
[30] Chao Xie, et al. A poor man’s BLASTX--high-throughput metagenomic protein database search using PAUDA, 2013, Bioinform.
[31] Mario Vento, et al. An Improved Algorithm for Matching Large Graphs, 2001.
[32] Lenore Cowen, et al. Compressive genomics for protein databases, 2013, Bioinform.
[33] Scott D. Kahn. On the Future of Genomic Data, 2011, Science.
[34] David J. Wild, et al. Grand challenges for cheminformatics, 2009, J. Cheminformatics.
[35] Yanli Wang, et al. PubChem: Integrated Platform of Small Molecules and Biological Activities, 2008.
[36] Charu C. Aggarwal, et al. Graph Clustering, 2010, Encyclopedia of Machine Learning and Data Mining.
[37] W. J. Kent, et al. BLAT--the BLAST-like alignment tool, 2002, Genome research.
[38] Pavel Zezula, et al. Similarity Search - The Metric Space Approach, 2005, Advances in Database Systems.
[39] Pavel Zezula, et al. A cost model for similarity queries in metric spaces, 1998, PODS '98.
[40] Tatiana Tatusova, et al. NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins, 2004, Nucleic Acids Res.
[41] Jeffrey K. Uhlmann, et al. Satisfying General Proximity/Similarity Queries with Metric Trees, 1991, Inf. Process. Lett.
[42] P. E. Bourne, et al. Protein structure alignment by incremental combinatorial extension (CE) of the optimal path, 1998, Protein engineering.
[43] E. Jacoby, et al. Chemogenomics: an emerging strategy for rapid target and drug discovery, 2004, Nature Reviews Genetics.
[44] M. Levitt, et al. Structural similarity of DNA-binding domains of bacteriophage repressors and the globin core, 1993, Current Biology.
[45] Kenneth Falconer, et al. Fractal Geometry: Mathematical Foundations and Applications, 1990.
[46] Sahil R. Kalra, et al. Big Challenges? Big Data …, 2015.
[47] Uri Alon, et al. Inferring biological tasks using Pareto analysis of high-dimensional data, 2015, Nature Methods.
[48] B. Berger, et al. Compressive genomics, 2012, Nature Biotechnology.
[49] Eric J. Alm, et al. Host lifestyle affects human microbiota on daily timescales, 2014, Genome Biology.
[50] S. Schuster, et al. Integrative analysis of environmental sequences using MEGAN4, 2011, Genome research.
[51] Lenore Cowen, et al. Matt: Local Flexibility Aids Protein Multiple Structure Alignment, 2008, PLoS Comput. Biol.
[52] Jesse R. Zaneveld, et al. Predictive functional profiling of microbial communities using 16S rRNA marker gene sequences, 2013, Nature Biotechnology.
[53] Tatiana A. Tatusova, et al. NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins, 2004, Nucleic Acids Res.
[54] Horst Bunke, et al. A graph distance metric based on the maximal common subgraph, 1998, Pattern Recognit. Lett.
[55] D. Macfabe, et al. Short-chain fatty acid fermentation products of the gut microbiome: implications in autism spectrum disorders, 2012, Microbial ecology in health and disease.
[56] Xavier Llorà, et al. Automated alphabet reduction method with evolutionary algorithms for protein structure prediction, 2007, GECCO '07.
[57] M. Gerstein, et al. The relationship between protein structure and function: a comprehensive survey with application to the yeast genome, 1999, Journal of molecular biology.
[58] Piotr Indyk, et al. Approximate nearest neighbors: towards removing the curse of dimensionality, 1998, STOC '98.
[59] Rainer Schrader, et al. Small Molecule Subgraph Detector (SMSD) toolkit, 2009, J. Cheminformatics.
[60] Piotr Indyk, et al. Approximate Nearest Neighbor: Towards Removing the Curse of Dimensionality, 2012, Theory Comput.
[61] N. Linial, et al. Global self-organization of all known protein sequences reveals inherent biological signatures, 1997, Journal of molecular biology.
[62] Hiroyuki Ogata, et al. KEGG: Kyoto Encyclopedia of Genes and Genomes, 1999, Nucleic Acids Res.
[63] Tao Jiang, et al. A maximum common substructure-based algorithm for searching and predicting drug-like compounds, 2008, ISMB.
[64] V. Marx. Biology: The big challenges of big data, 2013, Nature.