The method of N-grams in large-scale clustering of DNA texts
Zeev Volkovich | Eviatar Nevo | Abraham B. Korol | Alexander Bolshoy | Valery M. Kirzhner
[1] M Damashek,et al. Gauging Similarity with n-Grams: Language-Independent Categorization of Text , 1995, Science.
[2] S Karlin,et al. Heterogeneity of genomes: measures and values. , 1994, Proceedings of the National Academy of Sciences of the United States of America.
[3] Jeong Soo Ahn,et al. Using n-grams for Korean text retrieval , 1996, SIGIR '96.
[4] Peter Willett,et al. Searching for historical word-forms in a database of 17th-century English text using spelling-correction methods , 1992, SIGIR '92.
[5] E. Forgy. Cluster analysis of multivariate data: efficiency versus interpretability of classifications , 1965 .
[6] Robert R. Sokal,et al. A statistical method for evaluating systematic relationships , 1958 .
[7] Stephen Huffman. Acquaintance: Language-Independent Document Categorization by N-Grams , 1995, TREC.
[8] Alan M. Frieze,et al. Optimal Reconstruction of a Sequence from its Probes , 1999, J. Comput. Biol.
[9] William M. Rand,et al. Objective Criteria for the Evaluation of Clustering Methods , 1971 .
[10] C. L. Mallows,et al. A Method for Comparing Two Hierarchical Clusterings: Rejoinder , 1983 .
[11] Longin Jan Latecki,et al. Tree-structured partitioning based on splitting histograms of distances , 2003, Third IEEE International Conference on Data Mining.
[12] Joachim M. Buhmann,et al. A Resampling Approach to Cluster Validation , 2002, COMPSTAT.
[13] W. B. Cavnar,et al. Using An N-Gram-Based Document Representation With A Vector Processing Retrieval Model , 1994, TREC.
[14] E N Trifonov,et al. Linguistic measure of taxonomic and functional relatedness of nucleotide sequences. , 1990, Journal of biomolecular structure & dynamics.
[15] R. Huber,et al. The complete genome of the hyperthermophilic bacterium Aquifex aeolicus , 1998, Nature.
[16] Michael W. Berry,et al. Understanding search engines: mathematical modeling and text retrieval (software , 1999 .
[17] Zeev Volkovich,et al. Text mining with information-theoretic clustering , 2003, Comput. Sci. Eng.
[18] C. E. Shannon,et al. A mathematical theory of communication , 1948, MOCO.
[19] Anil K. Jain,et al. Algorithms for Clustering Data , 1988 .
[20] Franco P. Preparata,et al. Sequencing by hybridization using direct and reverse cooperating spectra , 2002, RECOMB '02.
[21] Jonathan D. Cohen. Highlights: language- and domain-independent automatic indexing terms for abstracting , 1995 .
[22] Peter J. Rousseeuw,et al. Finding Groups in Data: An Introduction to Cluster Analysis , 1990 .
[23] Franco P. Preparata,et al. Sequencing-by-hybridization revisited: the analog-spectrum proposal , 2004, IEEE/ACM Transactions on Computational Biology and Bioinformatics.
[24] T. de Heer. Experiments with syntactic traces in information retrieval , 1974, Inf. Storage Retr.
[25] S. Karlin,et al. Dinucleotide relative abundance extremes: a genomic signature. , 1995, Trends in genetics : TIG.
[26] E. Nevo,et al. A Large-Scale Comparison of Genomic Sequences: One Promising Approach , 2003, Acta Biotheoretica.
[27] Stephen Huffman,et al. Acquaintance: A Novel Vector-Space N-Gram Technique for Document Categorization , 1994, TREC.
[28] E. Nevo,et al. Compositional spectrum—revealing patterns for genomic sequence characterization and comparison , 2002 .
[29] Alexander Bolshoy,et al. DNA sequence analysis linguistic tools: contrast vocabularies, compositional spectra and linguistic complexity. , 2003, Applied bioinformatics.
[30] Elizabeth S. Adams,et al. Trigrams as index element in full text retrieval: observations and experimental results , 1993, CSC '93.
[31] C. Mallows,et al. A Method for Comparing Two Hierarchical Clusterings , 1983 .