Alignment-free sequence comparison: a review
[1] W. Pearson. Rapid and sensitive sequence comparison with FASTP and FASTA. , 1990, Methods in enzymology.
[2] Marin van Heel,et al. A new family of powerful multivariate statistical sequence analysis techniques. , 1991 .
[3] M. P. Cummings. PHYLIP (Phylogeny Inference Package) , 2004 .
[4] R. Durbin,et al. Biological sequence analysis: Background on probability , 1998 .
[5] J. Wootton. Introduction to computational biology: Maps, sequences and genomes; Interdisciplinary statistics , 1997 .
[6] T. Gisiger. Scale invariance in biology: coincidence or footprint of a universal mechanism? , 2001, Biological reviews of the Cambridge Philosophical Society.
[7] David Siegmund,et al. Approximate P-Values for Local Sequence Alignments: Numerical Studies , 2001, J. Comput. Biol..
[8] Winston A Hide,et al. A comprehensive approach to clustering of expressed human gene sequence: the sequence tag alignment and consensus knowledge base. , 1999, Genome research.
[9] D. Davison,et al. A measure of DNA sequence dissimilarity based on Mahalanobis distance between frequencies of words. , 1997, Biometrics.
[10] D. Davison,et al. d2_cluster: a validated method for clustering EST and full-length cDNA sequences. , 1999, Genome research.
[11] Ming Li,et al. An Introduction to Kolmogorov Complexity and Its Applications , 1997, Texts in Computer Science.
[12] D B Davison,et al. Alternative gene form discovery and candidate gene selection from gene indexing projects. , 1998, Genome research.
[13] Vladimir V. V'yugin,et al. Algorithmic Complexity and Stochastic Properties of Finite Binary Sequences , 1999, Comput. J..
[14] Michael S. Waterman,et al. Introduction to Computational Biology: Maps, Sequences and Genomes , 1998 .
[15] Audra E. Kosh,et al. Linear Algebra and its Applications , 1992 .
[16] Solomon Kullback,et al. Information Theory and Statistics , 1960 .
[17] D. B. Searls,et al. Reading the book of life , 2001, Bioinform..
[18] Thomas L. Madden,et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. , 1997, Nucleic acids research.
[19] T K Attwood. Genomics. The Babel of bioinformatics. , 2000, Science.
[20] N. Saitou,et al. The neighbor-joining method: a new method for reconstructing phylogenetic trees. , 1987, Molecular biology and evolution.
[21] M. O. Dayhoff,et al. A Model of Evolutionary Change in Proteins , 1978 .
[22] M. O. Dayhoff. A model of evolutionary change in proteins , 1978 .
[23] Elizabeth R. Jessup,et al. Matrices, Vector Spaces, and Information Retrieval , 1999, SIAM Rev..
[24] M S Waterman,et al. Identification of common molecular subsequences. , 1981, Journal of molecular biology.
[25] R. Mullin,et al. The distribution of the frequency of occurrence of nucleotide subsequences, based on their overlap capability. , 1989, Biometrics.
[26] Winston Hide,et al. Biological Evaluation of d2, an Algorithm for High-Performance Sequence Comparison , 1994, J. Comput. Biol..
[27] Jonas S. Almeida,et al. Universal sequence map (USM) of arbitrary discrete sequences , 2002, BMC Bioinformatics.
[28] T Reichhardt,et al. It's sink or swim as a tidal wave of data approaches , 1999, Nature.
[29] Pavel A. Pevzner,et al. Statistical distance between texts and filtration methods in sequence comparison , 1992, Comput. Appl. Biosci..
[30] O. Gotoh. An improved algorithm for matching biological sequences. , 1982, Journal of molecular biology.
[31] H. J. Jeffrey. Chaos game representation of gene structure. , 1990, Nucleic acids research.
[32] Alan Christoffels,et al. A Novel Approach Towards a Comprehensive Consensus Representation of the Expressed Human Genome , 1997 .
[33] Pasquale Petrilli. Classification of protein sequences by their dipeptide composition , 1993, Comput. Appl. Biosci..
[34] James Ze Wang,et al. SST: an algorithm for finding near-exact sequence matches in time proportional to the logarithm of the database size , 2002, Bioinform..
[35] Christian Gautier,et al. Statistical method for predicting protein coding regions in nucleic acid sequences , 1987, Comput. Appl. Biosci..
[36] A A Zharkikh,et al. Statistical analysis of L-tuple frequencies in eubacteria and organelles. , 1993, Bio Systems.
[37] Dan Gusfield,et al. Algorithms on Strings, Trees, and Sequences - Computer Science and Computational Biology , 1997 .
[38] Robert Miller,et al. STACK: Sequence Tag Alignment and Consensus Knowledgebase , 2001, Nucleic Acids Res..
[39] S. Henikoff,et al. Amino acid substitution matrices from protein blocks. , 1992, Proceedings of the National Academy of Sciences of the United States of America.
[40] Mireille Régnier,et al. A unified approach to word statistics , 1998, RECOMB '98.
[41] T. Lundstedt,et al. Classification of G‐protein coupled receptors by alignment‐independent extraction of principal chemical properties of primary amino acid sequences , 2002, Protein science : a publication of the Protein Society.
[42] W. Stemmer,et al. Genome shuffling leads to rapid phenotypic improvement in bacteria , 2002, Nature.
[43] S. B. Needleman,et al. A general method applicable to the search for similarities in the amino acid sequence of two proteins. , 1970, Journal of molecular biology.
[44] P Petrilli,et al. PFDB: A Protein Families DataBase for Macintosh Computers. The Effectiveness of Its Organization in Searching for Protein Similarity , 1997, Journal of protein chemistry.
[45] N N Alexandrov,et al. Statistical method for rapid homology search. , 1988, Nucleic acids research.
[46] John E. Carpenter,et al. Assessment of the parallelization approach of d2_cluster for high‐performance sequence clustering , 2002, J. Comput. Chem..
[47] Jonas S. Almeida,et al. Analysis of genomic sequences by Chaos Game Representation , 2001, Bioinform..
[48] Brian Everitt,et al. Cluster analysis , 1974 .
[49] Daniel B. Davison,et al. Brute force estimation of the number of human genes using EST clustering as a measure , 2001, IBM J. Res. Dev..
[50] D. Lipman,et al. Improved tools for biological sequence comparison. , 1988, Proceedings of the National Academy of Sciences of the United States of America.
[51] H Moereels,et al. Classification and identification of proteins by means of common and specific amino acid n-tuples in unaligned sequences. , 1998, Computer methods and programs in biomedicine.
[52] Robert B. Ash,et al. Information Theory , 2020, The SAGE International Encyclopedia of Mass Media and Society.
[53] A. J. Gibbs,et al. The Transition Matrix Method for Comparing Sequences; Its use in Describing and Classifying Proteins by their Amino Acid Sequences , 1971 .
[54] Tiee-Jian Wu,et al. Statistical Measures of DNA Sequence Dissimilarity under Markov Chain Models of Base Composition , 2001, Biometrics.
[55] Steve Baker,et al. Integrated gene and species phylogenies from unaligned whole genome protein sequences , 2002, Bioinform..
[56] Xin Chen,et al. An information-based sequence distance and its application to whole mitochondrial genome phylogeny , 2001, Bioinform..
[57] Rainer Fuchs. From Sequence to Biology: The Impact on Bioinformatics , 2002, Bioinform..
[58] James R. Schott,et al. Matrix Analysis for Statistics , 2005 .
[59] Benjamin Yakir,et al. Approximate p-values for local sequence alignments , 2000 .
[60] S. Henikoff,et al. Amino acid substitution matrices. , 2000, Advances in protein chemistry.
[61] J. Leader,et al. A comprehensive vertebrate phylogeny using vector representations of protein sequences from whole genomes. , 2002, Molecular biology and evolution.
[62] Victor V. Solovyev,et al. A novel method of protein sequence classification based on oligopeptide frequency analysis and its application to search for functional sites and to domain localization , 1993, Comput. Appl. Biosci..
[63] Gesine Reinert,et al. Probabilistic and Statistical Properties of Words: An Overview , 2000, J. Comput. Biol..
[64] E. Myers,et al. Basic local alignment search tool. , 1990, Journal of molecular biology.
[65] E V Koonin. The emerging paradigm and open problems in comparative genomics. , 1999, Bioinformatics.
[66] M. Lynch. Intron evolution as a population-genetic process , 2002, Proceedings of the National Academy of Sciences of the United States of America.
[67] Dónall A. Mac Dónaill,et al. Representation of amino acids as five-bit or three-bit patterns for filtering protein databases , 2001, Bioinform..
[69] A A Zharkikh,et al. Quick assessment of similarity of two sequences by comparison of their L-tuple frequencies. , 1993, Bio Systems.
[70] Teresa K. Attwood,et al. The Babel of Bioinformatics , 2000, Science.
[71] Thomas M. Cover,et al. Elements of Information Theory , 2005 .
[72] Xin Chen,et al. A compression algorithm for DNA sequences and its applications in genome comparison , 2000, RECOMB '00.
[73] B. Blaisdell. A measure of the similarity of sets of sequences not requiring sequence alignment. , 1986, Proceedings of the National Academy of Sciences of the United States of America.
[74] William R. Pearson. Protein sequence comparison and protein evolution , 1995, ISMB 1995.