A benchmark study of sequence alignment methods for protein clustering
[1] J. Pei,et al. Multiple protein sequence alignment. , 2008, Current opinion in structural biology.
[2] Adam Godzik,et al. Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences , 2006, Bioinform..
[3] Michalis Vazirgiannis,et al. Clustering validity assessment: finding the optimal partitioning of a data set , 2001, Proceedings 2001 IEEE International Conference on Data Mining.
[4] W. Pearson. Rapid and sensitive sequence comparison with FASTP and FASTA. , 1990, Methods in enzymology.
[5] Amos Bairoch,et al. The PROSITE database , 2005, Nucleic Acids Res..
[6] Shmuel Pietrokovski,et al. Blocks+: a non-redundant database of protein alignment blocks derived from multiple compilations , 1999, Bioinform..
[7] Mark Johnson,et al. NCBI BLAST: a better web interface , 2008, Nucleic Acids Res..
[8] Ricardo J. G. B. Campello,et al. On the Comparison of Relative Clustering Validity Criteria , 2009, SDM.
[9] Gajendra P. S. Raghava,et al. OXBench: A benchmark for evaluation of protein multiple sequence alignment accuracy , 2003, BMC Bioinformatics.
[10] S. Needleman,et al. A General Method Applicable to the Search for Similarities in the Amino Acid Sequence of Two Proteins , 1970, Journal of molecular biology.
[11] Jimin Pei,et al. PROMALS: towards accurate multiple sequence alignments of distantly related proteins , 2007, Bioinform..
[12] Michael Kaufmann,et al. DIALIGN-TX: greedy and progressive approaches for segment-based multiple sequence alignment , 2008, Algorithms for Molecular Biology.
[13] Li Liao,et al. Combining Pairwise Sequence Similarity and Support Vector Machines for Detecting Remote Protein Evolutionary and Structural Relationships , 2003, J. Comput. Biol..
[14] Elisabeth R. M. Tillier,et al. The accuracy of several multiple sequence alignment programs for proteins , 2006, BMC Bioinformatics.
[15] J. Dunn. Well-Separated Clusters and Optimal Fuzzy Partitions , 1974 .
[16] W. J. Kent,et al. BLAT--the BLAST-like alignment tool. , 2002, Genome research.
[17] Christopher J. Lee,et al. Multiple sequence alignment using partial order graphs , 2002, Bioinform..
[18] M. Suchard,et al. Alignment Uncertainty and Genomic Analysis , 2008, Science.
[19] Chuong B. Do,et al. ProbCons: Probabilistic consistency-based multiple sequence alignment. , 2005, Genome research.
[20] Desmond G. Higgins,et al. Analysis and Comparison of Benchmarks for Multiple Sequence Alignment , 2006, Silico Biol..
[21] Michael Kaufmann,et al. BMC Bioinformatics BioMed Central , 2005 .
[22] P. Rousseeuw. Silhouettes: a graphical aid to the interpretation and validation of cluster analysis , 1987 .
[23] John P. Overington,et al. HOMSTRAD: A database of protein structure alignments for homologous families , 1998, Protein science : a publication of the Protein Society.
[24] Cédric Notredame,et al. 3DCoffee: combining protein sequences and structures within multiple sequence alignments. , 2004, Journal of molecular biology.
[25] Robert C. Edgar,et al. BIOINFORMATICS APPLICATIONS NOTE , 2001 .
[26] Tim J. P. Hubbard,et al. Data growth and its impact on the SCOP database: new developments , 2007, Nucleic Acids Res..
[27] Erik L. L. Sonnhammer,et al. Kalign – an accurate and fast multiple sequence alignment algorithm , 2005, BMC Bioinformatics.
[28] Robert C. Edgar,et al. Quality measures for protein alignment benchmarks , 2010, Nucleic acids research.
[29] Martin Hartmann,et al. Introducing mothur: Open-Source, Platform-Independent, Community-Supported Software for Describing and Comparing Microbial Communities , 2009, Applied and Environmental Microbiology.
[30] Fabrice Armougom,et al. Expresso: automatic incorporation of structural information in multiple sequence alignments using 3D-Coffee , 2006, Nucleic Acids Res..
[31] Jimin Pei,et al. AL2CO: calculation of positional conservation in a protein sequence alignment , 2001, Bioinform..
[32] Robert C. Edgar,et al. MUSCLE: a multiple sequence alignment method with reduced time and space complexity , 2004, BMC Bioinformatics.
[33] Olivier Poch,et al. A comprehensive comparison of multiple sequence alignment programs , 1999, Nucleic Acids Res..
[34] Olivier Poch,et al. A Comprehensive Benchmark Study of Multiple Sequence Alignment Methods: Current Challenges and Future Perspectives , 2011, PloS one.
[35] Jérôme Gracy,et al. Automated protein sequence database classification. I. Integration of compositional similarity search, local similarity search, and multiple sequence alignment , 1998, Bioinform..
[36] Donald W. Bouldin,et al. A Cluster Separation Measure , 1979, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[37] Olivier Poch,et al. A new protein linear motif benchmark for multiple sequence alignment software , 2008, BMC Bioinformatics.
[38] Maurits J. J. Dijkstra,et al. Multiple Sequence Alignment. , 2017, Methods in molecular biology.
[39] E. Myers,et al. Basic local alignment search tool. , 1990, Journal of molecular biology.
[40] D. Higgins,et al. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega , 2011, Molecular systems biology.
[41] Tu Minh Phuong,et al. Multiple alignment of protein sequences with repeats and rearrangements , 2006, Nucleic acids research.
[42] C. Notredame,et al. Recent progress in multiple sequence alignment: a survey. , 2002, Pharmacogenomics.
[43] Jérôme Gouzy,et al. The ProDom database of protein domain families , 1998, Nucleic Acids Res..
[44] C. Sander,et al. Database of homology‐derived protein structures and the structural meaning of sequence alignment , 1991, Proteins.
[45] Qinghua Hu,et al. HAlign: Fast multiple similar DNA/RNA sequence alignment based on the centre star strategy , 2015, Bioinform..
[46] L. Holm,et al. The Pfam protein families database , 2005, Nucleic Acids Res..
[47] Stephanie Boehm,et al. Applied Multivariate Techniques , 2016 .
[48] Olivier Poch,et al. BAliBASE: a benchmark alignment database for the evaluation of multiple alignment programs , 1999, Bioinform..
[49] Lode Wyns,et al. Align-m-a new algorithm for multiple alignment of highly divergent sequences , 2004, Bioinform..
[50] K. Katoh,et al. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. , 2002, Nucleic acids research.
[51] J. Thompson,et al. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. , 1994, Nucleic acids research.
[52] Robert C. Edgar,et al. MUSCLE: multiple sequence alignment with high accuracy and high throughput. , 2004, Nucleic acids research.
[53] I. Longden,et al. EMBOSS: the European Molecular Biology Open Software Suite. , 2000, Trends in genetics : TIG.
[54] Yaoqi Zhou,et al. SPEM: improving multiple sequence alignment with sequence profiles and predicted secondary structures. , 2005, Bioinformatics.
[55] Jian Li,et al. Advanced computational algorithms for microbial community analysis using massive 16S rRNA sequence data , 2010, Nucleic acids research.
[56] William G. Mckendree,et al. ESPRIT: estimating species richness using large collections of 16S rRNA pyrosequences , 2009, Nucleic acids research.
[57] C. Sander,et al. Are binding residues conserved? , 1998, Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing.
[58] Andrew E. Torda,et al. Not assessing the efficiency of multiple sequence alignment programs , 2014, Algorithms for Molecular Biology.
[59] E. Birney,et al. Pfam: the protein families database , 2013, Nucleic Acids Res..
[60] Michalis Vazirgiannis,et al. Quality Scheme Assessment in the Clustering Process , 2000, PKDD.
[61] H O Villar,et al. Amino acid preferences at protein binding sites , 1994, FEBS letters.
[62] Burkhard Morgenstern,et al. DIALIGN: finding local similarities by multiple sequence alignment , 1998, Bioinform..
[63] Erik L L Sonnhammer,et al. Quality assessment of multiple alignment programs , 2002, FEBS letters.
[64] Guilherme Oliveira,et al. Assessing the efficiency of multiple sequence alignment programs , 2014, Algorithms for Molecular Biology.
[65] Olivier Poch,et al. BAliBASE (Benchmark Alignment dataBASE): enhancements for repeats, transmembrane sequences and circular permutations , 2001, Nucleic Acids Res..
[66] D. Higgins,et al. T-Coffee: A novel method for fast and accurate multiple sequence alignment. , 2000, Journal of molecular biology.
[67] Lode Wyns,et al. SABmark- a benchmark for sequence alignment that covers the entire known fold space , 2005, Bioinform..
[68] Xiaoyu Wang,et al. A large-scale benchmark study of existing algorithms for taxonomy-independent microbial community analysis , 2012, Briefings Bioinform..
[69] N. Grishin,et al. PROMALS3D: a tool for multiple protein sequence and structure alignments , 2008, Nucleic acids research.
[70] J. Thompson,et al. Issues in bioinformatics benchmarking: the case study of multiple sequence alignment , 2010, Nucleic acids research.
[71] M. A. McClure,et al. Comparative analysis of multiple protein-sequence alignment methods. , 1994, Molecular biology and evolution.
[72] Haruki Nakamura,et al. Announcing the worldwide Protein Data Bank , 2003, Nature Structural Biology.
[73] Philip Hugenholtz,et al. NAST: a multiple sequence alignment server for comparative analysis of 16S rRNA genes , 2006, Nucleic Acids Res..
[74] Yunpeng Cai,et al. ESPRIT-Tree: hierarchical clustering analysis of millions of 16S rRNA pyrosequences in quasilinear computational time , 2011, Nucleic acids research.
[75] N. Grishin,et al. Crystal structure of YbaK protein from Haemophilus influenzae (HI1434) at 1.8 Å resolution: Functional implications , 2000, Proteins.
[76] Olivier Poch,et al. BAliBASE 3.0: Latest developments of the multiple sequence alignment benchmark , 2005, Proteins.
[77] K. Katoh,et al. MAFFT Multiple Sequence Alignment Software Version 7: Improvements in Performance and Usability , 2013, Molecular biology and evolution.
[78] M S Waterman,et al. Identification of common molecular subsequences. , 1981, Journal of molecular biology.
[79] Subhash Sharma. Applied multivariate techniques , 1995 .
[80] David J. States,et al. Identification of protein coding regions by database similarity search , 1993, Nature Genetics.