Benchmarking of alignment-free sequence comparison methods
Matteo Comin | Jonas S. Almeida | Burkhard Morgenstern | Fengzhu Sun | Michael S. Waterman | Wojciech M. Karlowski | Benjamin T. James | Susana Vinga | Cheong Xin Chan | Hani Z. Girgis | Chris-André Leimeister | Thomas Dencker | Kujin Tang | Anna Katharina Lau | Sophie Röhling | Jae Jin Choi | Andrzej Zielezinski | Guillaume Bernard | Sung-Hou Kim
[1] Se-Ran Jun,et al. Alignment-free genome comparison with feature frequency profiles (FFP) and optimal resolutions , 2009, Proceedings of the National Academy of Sciences.
[2] Rui Dong,et al. Positional Correlation Natural Vector: A Novel Method for Genome Comparison , 2020, International journal of molecular sciences.
[3] H. J. Jeffrey. Chaos game representation of gene structure. , 1990, Nucleic acids research.
[4] Timothy J. Harlow,et al. Highways of gene sharing in prokaryotes. , 2005, Proceedings of the National Academy of Sciences of the United States of America.
[5] Sung-Hou Kim,et al. Whole-genome phylogeny of Escherichia coli/Shigella group by feature frequency profiles (FFPs) , 2011, Proceedings of the National Academy of Sciences.
[6] Ioannis Xenarios,et al. Taxon sampling unequally affects individual nodes in a phylogenetic tree: consequences for model gene tree construction in SwissTree , 2017, bioRxiv.
[7] Xiangde Zhang,et al. Alignment free comparison: similarity distribution between the DNA primary sequences based on the shortest absent word. , 2012, Journal of theoretical biology.
[8] Burkhard Morgenstern,et al. The number of k-mer matches between two DNA sequences as a function of k and applications to estimate phylogenetic distances , 2020, PloS one.
[9] Patrice Koehl,et al. The ASTRAL compendium for protein structure and sequence analysis , 2000, Nucleic Acids Res..
[10] Nuno A. Fonseca,et al. Assemblathon 1: a competitive assessment of de novo short read assembly methods. , 2011, Genome research.
[11] Cheng Soon Ong,et al. kWIP: The k-mer weighted inner product, a de novo estimator of genetic similarity , 2016, bioRxiv.
[12] I. Miklós,et al. Dynamics of Genome Rearrangement in Bacterial Populations , 2008, PLoS genetics.
[13] Yanchun Yang,et al. Markov model plus k-word distributions: a synergy that produces novel statistical measures for sequence comparison , 2008, Bioinform..
[14] Burkhard Morgenstern,et al. Prot-SpaM: fast alignment-free phylogeny reconstruction based on whole-proteome sequences , 2019, GigaScience.
[15] M. Kuhner,et al. Practical performance of tree comparison metrics. , 2015, Systematic biology.
[16] Patrice Koehl,et al. The ASTRAL Compendium in 2004 , 2003, Nucleic Acids Res..
[17] Se-Ran Jun,et al. Whole-proteome phylogeny of prokaryotes by feature frequency profiles: An alignment-free method with optimal feature resolution , 2009, Proceedings of the National Academy of Sciences.
[18] Alexandros Stamatakis,et al. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies , 2014, Bioinform..
[19] H. W. Parker,et al. Systematic Zoology , 1896, Nature.
[20] Jonas S. Almeida,et al. Alignment-free sequence comparison: benefits, applications, and tools , 2017, Genome Biology.
[21] Bernhard Haubold,et al. Alignment-free phylogenetics and population genetics , 2014, Briefings Bioinform..
[22] W. Martin,et al. Getting a better picture of microbial evolution en route to a network of genomes , 2009, Philosophical Transactions of the Royal Society B: Biological Sciences.
[23] Thomas Wiehe,et al. Estimating Mutation Distances from Unaligned Genomes , 2009, J. Comput. Biol..
[24] Paul Greenfield,et al. k-mer Similarity, Networks of Microbial Genomes, and Taxonomic Rank , 2017, mSystems.
[25] Xin Chen,et al. Comparison of next-generation sequencing samples using compression-based distances and its application to phylogenetic reconstruction , 2014, BMC Research Notes.
[26] Brian D. Ondov,et al. Mash: fast genome and metagenome distance estimation using MinHash , 2015, Genome Biology.
[27] Steven E. Brenner,et al. SCOPe: Structural Classification of Proteins—extended, integrating SCOP and ASTRAL data and classification of new structures , 2013, Nucleic Acids Res..
[28] Bin Ma,et al. PatternHunter II: Highly Sensitive and Fast Homology Search , 2004, J. Bioinform. Comput. Biol..
[29] Matteo Comin,et al. Alignment-free phylogeny of whole genomes using underlying subwords , 2012, Algorithms for Molecular Biology.
[30] Adrian M. Altenhoff,et al. Standardized benchmarking in the quest for orthologs , 2016, Nature Methods.
[31] C R Woese,et al. Classification of methanogenic bacteria by 16S ribosomal RNA characterization. , 1977, Proceedings of the National Academy of Sciences of the United States of America.
[32] M. Ragan,et al. Inferring phylogenies of evolving sequences without multiple sequence alignment , 2014, Scientific Reports.
[33] Anthony R. Ives,et al. An assembly and alignment-free method of phylogeny reconstruction from next-generation sequencing data , 2015, BMC Genomics.
[34] Huiguang Yi,et al. Co-phylog: an assembly-free phylogenomic approach for closely related organisms , 2010, Nucleic acids research.
[35] Bruno Bauwens,et al. LZW-Kernel: fast kernel utilizing variable length code blocks from LZW compressors for protein sequence classification , 2018, Bioinform..
[36] Gaël Varoquaux,et al. Scikit-learn: Machine Learning in Python , 2011, J. Mach. Learn. Res..
[37] Winston Hide,et al. Biological Evaluation of d2, an Algorithm for High-Performance Sequence Comparison , 1994, J. Comput. Biol..
[38] Pandurang Kolekar,et al. Alignment-free distance measure based on return time distribution for sequence analysis: applications to clustering, molecular phylogeny and subtyping. , 2012, Molecular phylogenetics and evolution.
[39] Chenhui Yang,et al. An estimator for local analysis of genome based on the minimal absent word. , 2016, Journal of theoretical biology.
[40] M. Ragan,et al. A novel alignment-free method for detection of lateral genetic transfer based on TF-IDF , 2016, Scientific Reports.
[41] Sung-Hou Kim,et al. A genome Tree of Life for the Fungi kingdom , 2017, Proceedings of the National Academy of Sciences.
[42] Satish Rao,et al. Quartet MaxCut: a fast algorithm for amalgamating quartet trees. , 2012, Molecular phylogenetics and evolution.
[43] Jed A. Fuhrman,et al. CAFE: aCcelerated Alignment-FrEe sequence analysis , 2017, Nucleic Acids Res..
[44] Burkhard Morgenstern,et al. The number of k-mer matches between two DNA sequences as a function of k and applications to estimate phylogenetic distances , 2019, bioRxiv.
[45] Gesine Reinert,et al. Alignment-Free Sequence Analysis and Applications. , 2018, Annual review of biomedical data science.
[46] Cheong Xin Chan,et al. Recapitulating phylogenies using k-mers: from trees to networks , 2016, F1000Research.
[47] Inanç Birol,et al. Assemblathon 2: evaluating de novo methods of genome assembly in three vertebrate species , 2013, GigaScience.
[48] Tom Slezak,et al. kSNP3.0: SNP detection and phylogenetic analysis of genomes without genome alignment or reference genome , 2015, Bioinform..
[49] Susana Vinga,et al. Information theory applications for biological sequence analysis , 2013, Briefings Bioinform..
[50] T. Warnow,et al. Unblended disjoint tree merging using GTM improves species tree estimation , 2020, BMC Genomics.
[51] Marc S Halfon,et al. Computational discovery of cis-regulatory modules in Drosophila without prior knowledge of motifs , 2008, Genome Biology.
[52] M. Ragan,et al. Next-generation phylogenomics , 2013, Biology Direct.
[53] Dhundy Bastola,et al. Alignment-free genetic sequence comparisons: a review of recent approaches by word analysis , 2014, Briefings Bioinform..
[54] David Burstein,et al. The Average Common Substring Approach to Phylogenomic Reconstruction , 2006, J. Comput. Biol..
[55] Bin Ma,et al. PatternHunter: faster and more sensitive homology search , 2002, Bioinform..
[56] Martin R. Smith,et al. Bayesian and parsimony approaches reconstruct informative trees from simulated morphological datasets , 2019, Biology Letters.
[57] Eric Bapteste,et al. Pattern pluralism and the Tree of Life hypothesis , 2007 .
[58] Erich Bornberg-Bauer,et al. Rapid similarity search of proteins using alignments of domain arrangements , 2014, Bioinform..
[59] Leping Li,et al. ART: a next-generation sequencing read simulator , 2012, Bioinform..
[60] Yongchao Liu,et al. A greedy alignment-free distance estimator for phylogenetic inference , 2015, 2015 IEEE 5th International Conference on Computational Advances in Bio and Medical Sciences (ICCABS).
[61] Diogo Pratas,et al. Smash++: an alignment-free and memory-efficient tool to find genomic rearrangements , 2020, GigaScience.
[62] Changchuan Yin,et al. An improved model for whole genome phylogenetic analysis by Fourier transform. , 2015, Journal of theoretical biology.
[63] Bernhard Haubold,et al. andi: Fast and accurate estimation of evolutionary distances between closely related genomes , 2015, Bioinform..
[64] Vineet Bafna,et al. Skmer: assembly-free and alignment-free sample identification using genome skims , 2019, Genome Biology.
[65] D. Robinson,et al. Comparison of phylogenetic trees , 1981 .
[66] E. Myers,et al. Basic local alignment search tool. , 1990, Journal of molecular biology.
[67] Burkhard Morgenstern,et al. Fast alignment-free sequence comparison using spaced-word frequencies , 2014, Bioinform..
[68] Sagi Snir,et al. Multi-SpaM: A Maximum-Likelihood Approach to Phylogeny Reconstruction Using Multiple Spaced-Word Matches and Quartet Trees , 2018, RECOMB-CG.
[69] Jonas S. Almeida,et al. Entropic Profiler – detection of conservation in genomes using information theory , 2009, BMC Research Notes.
[70] Chenglong Yu,et al. A protein map and its application. , 2008, DNA and cell biology.
[71] Robert G. Beiko,et al. A simulation test bed for hypotheses of genome evolution , 2007, Bioinform..
[72] I. Longden,et al. EMBOSS: the European Molecular Biology Open Software Suite. , 2000, Trends in genetics : TIG.
[73] Burkhard Morgenstern,et al. The number of k-mer matches between two DNA sequences as a function of k and applications to estimate phylogenetic distances , 2019, bioRxiv.
[74] Saurabh Sinha,et al. A statistical method for alignment-free comparison of regulatory sequences , 2007, ISMB/ECCB.
[75] Gerhard G. Thallinger,et al. Complete Mitochondrial DNA Sequences of the Threadfin Cichlid (Petrochromis trewavasae) and the Blunthead Cichlid (Tropheus moorii) and Patterns of Mitochondrial Genome Evolution in Cichlid Fishes , 2013, PloS one.
[76] P. Bork,et al. ETE 3: Reconstruction, Analysis, and Visualization of Phylogenomic Data , 2016, Molecular biology and evolution.
[77] Jonas S. Almeida,et al. Alignment-free sequence comparison-a review , 2003, Bioinform..
[78] Kai Song,et al. New developments of alignment-free sequence comparison: measures, statistics and next-generation sequencing , 2014, Briefings Bioinform..
[79] K. Hatje,et al. A Phylogenetic Analysis of the Brassicales Clade Based on an Alignment-Free Sequence Comparison Method , 2012, Front. Plant Sci..
[80] Burkhard Morgenstern,et al. kmacs: the k-mismatch average common substring approach to alignment-free sequence comparison , 2014, Bioinform..
[81] Burkhard Morgenstern,et al. Fast and accurate phylogeny reconstruction using filtered spaced-word matches , 2017, Bioinform..
[82] Jonas S. Almeida,et al. Analysis of genomic sequences by Chaos Game Representation , 2001, Bioinform..
[83] Philip D. Blood,et al. Critical Assessment of Metagenome Interpretation—a benchmark of metagenomics software , 2017, Nature Methods.
[84] F. Balloux,et al. Large-scale network analysis captures biological features of bacterial plasmids , 2020, Nature Communications.
[85] Benjamin T James,et al. A survey and evaluations of histogram-based statistics in alignment-free sequence comparison , 2017, Briefings Bioinform..
[86] Jianhua Lin,et al. Divergence measures based on the Shannon entropy , 1991, IEEE Trans. Inf. Theory.
[87] Matteo Comin,et al. Benchmarking of alignment-free sequence comparison methods , 2019 .
[88] Mark A Ragan,et al. Within-species lateral genetic transfer and the evolution of transcriptional regulation in Escherichia coli and Shigella , 2011, BMC Genomics.
[89] James M. Hogan,et al. Alignment-free inference of hierarchical and reticulate phylogenomic relationships , 2017, Briefings Bioinform..
[90] Mark A. Ragan,et al. Alignment-free microbial phylogenomics under scenarios of sequence divergence, genome rearrangement and lateral genetic transfer , 2016, Scientific Reports.
[91] Gesine Reinert,et al. Alignment-Free Sequence Comparison (II): Theoretical Power of Comparison Statistics , 2010, J. Comput. Biol..
[92] Burkhard Morgenstern,et al. Read-SpaM: assembly-free and alignment-free comparison of bacterial genomes with low sequencing coverage , 2019, BMC Bioinformatics.
[93] Xin Chen,et al. An information-based sequence distance and its application to whole mitochondrial genome phylogeny , 2001, Bioinform..
[94] D. Davison,et al. A measure of DNA sequence dissimilarity based on Mahalanobis distance between frequencies of words. , 1997, Biometrics.
[95] Eun Ji Kim,et al. Simulation-based comprehensive benchmarking of RNA-seq aligners , 2016, Nature Methods.
[96] S. Carroll,et al. Genome-scale approaches to resolving incongruence in molecular phylogenies , 2003, Nature.
[97] Hilde van der Togt,et al. Publisher's Note , 2003, J. Netw. Comput. Appl..
[98] David Haussler,et al. Alignathon: a competitive assessment of whole-genome alignment methods , 2014, bioRxiv.
[99] Matteo Comin,et al. Fast Entropic Profiler: An Information Theoretic Approach for the Discovery of Patterns in Genomes , 2014, IEEE/ACM Transactions on Computational Biology and Bioinformatics.
[100] Donald A. Adjeroh,et al. K2 and K2*: efficient alignment‐free sequence similarity measurement based on Kendall statistics , 2018, Bioinform..
[101] Burkhard Morgenstern,et al. Estimating evolutionary distances between genomic sequences from spaced-word matches , 2015, Algorithms for Molecular Biology.
[102] Klas Hatje,et al. Spaced words and kmacs: fast alignment-free sequence comparison based on inexact word matches , 2014, Nucleic Acids Res..
[103] B. Blaisdell. A measure of the similarity of sets of sequences not requiring sequence alignment. , 1986, Proceedings of the National Academy of Sciences of the United States of America.
[104] J. Thompson,et al. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. , 1994, Nucleic acids research.
[105] Jonas S. Almeida,et al. Sequence analysis by iterated maps, a review , 2014, Briefings Bioinform..
[106] Gesine Reinert,et al. Alignment-Free Sequence Comparison (I): Statistics and Power , 2009, J. Comput. Biol..
[107] Jonas S. Almeida,et al. Comparative evaluation of word composition distances for the recognition of SCOP relationships , 2004, Bioinform..
[108] Fred R. McMorris,et al. COMPARISON OF UNDIRECTED PHYLOGENETIC TREES BASED ON SUBTREES OF FOUR EVOLUTIONARY UNITS , 1985 .
[109] Matteo Comin,et al. On the comparison of regulatory sequences with multiple resolution Entropic Profiles , 2016, BMC Bioinformatics.