BLMT: statistical sequence analysis using N-grams.
Judith Klein-Seetharaman | Madhavi Ganapathiraju | Vijayalaxmi Manoharan
[1] Lorna J. Smith,et al. Long-Range Interactions Within a Nonnative Protein , 2002, Science.
[2] Tetsuo Shibuya,et al. Indexing huge genome sequences for solving various problems. , 2001, Genome informatics. International Conference on Genome Informatics.
[3] Kuo-Chen Chou,et al. Prediction of protein secondary structure content by artificial neural network , 2003, J. Comput. Chem..
[4] Yael Mandel-Gutfreund,et al. On the significance of alternating patterns of polar and non-polar residues in beta-strands. , 2002, Journal of molecular biology.
[5] P. Y. Chou,et al. Prediction of the secondary structure of proteins from their amino acid sequence. , 2006 .
[6] David Haussler,et al. Classifying G-protein coupled receptors with support vector machines , 2002, Bioinform..
[7] D Larhammar,et al. Lack of biological significance in the 'linguistic features' of noncoding DNA--a quantitative analysis. , 1996, Nucleic acids research.
[8] Chan,et al. Can Zipf distinguish language from noise in noncoding DNA? , 1996, Physical review letters.
[9] Partha Niyogi,et al. A Note on Zipf's Law, Natural Languages, and Noncoding DNA regions , 1995, ArXiv.
[10] Eugene W. Myers,et al. Suffix arrays: a new method for on-line string searches , 1993, SODA '90.
[11] B. Rost,et al. State-of-the-art in membrane protein prediction. , 2002, Applied bioinformatics.
[12] J. Klein-Seetharaman,et al. Yule Value Tables from Protein Datasets , 2004 .
[13] Gad M. Landau,et al. Sequence complexity profiles of prokaryotic genomic sequences: A fast algorithm for calculating linguistic complexity , 2002, Bioinform..
[14] R. Durbin,et al. Enhanced protein domain discovery by using language modeling techniques from speech recognition , 2003, Proceedings of the National Academy of Sciences of the United States of America.
[15] Judith Klein-Seetharaman,et al. Identification of fundamental building blocks in protein sequences using statistical association measures , 2004, SAC '04.
[16] K. Chou,et al. Prediction of protein secondary structure content. , 1999, Protein engineering.
[17] J. Richardson,et al. Amino acid preferences for specific locations at the ends of alpha helices. , 1988, Science.
[18] S Erhan,et al. Amino-acid neighborhood relationships in proteins. Breakdown of amino-acid sequences into overlapping doublets, triplets and quadruplets. , 1980, International journal of bio-medical computing.
[19] T G Dewey,et al. The Shannon information entropy of protein sequences. , 1996, Biophysical journal.
[20] N. Balakrishnan,et al. Characterization of protein secondary structure , 2004, IEEE Signal Processing Magazine.
[21] P. Holland,et al. Discrete Multivariate Analysis. , 1976 .
[22] Wentian Li,et al. Statistical Properties of Open Reading Frames in Complete Genome Sequences , 1999, Comput. Chem..
[23] Andreas D. Baxevanis,et al. Bioinformatics - a practical guide to the analysis of genes and proteins , 2001, Methods of biochemical analysis.
[24] Jaime G. Carbonell,et al. Comparative N-gram Analysis of Genome Sequences , 2001 .
[25] Per Jambeck,et al. Developing Bioinformatics Computer Skills , 2001 .
[26] H E Stanley,et al. Linguistic features of noncoding DNA sequences. , 1994, Physical review letters.
[27] Richard Bonneau,et al. Ab initio protein structure prediction: progress and prospects. , 2001, Annual review of biophysics and biomolecular structure.
[28] Cathy H. Wu,et al. Protein classification artificial neural system , 1992, Protein science : a publication of the Protein Society.
[29] A A Tsonis,et al. Is DNA a language? , 1997, Journal of theoretical biology.
[30] L. Wasserman,et al. Exponential Language Models, Logistic Regression, and Semantic Coherence , 2000 .
[31] Stanley F. Chen,et al. An empirical study of smoothing techniques for language modeling , 1999 .
[32] S F Altschul,et al. Statistical methods and insights for protein and DNA sequences. , 1991, Annual review of biophysics and biophysical chemistry.
[33] S Karlin,et al. Trinucleotide repeats and long homopeptides in genes and proteins associated with nervous system disease and development. , 1996, Proceedings of the National Academy of Sciences of the United States of America.
[34] S. Karlin,et al. Prediction of complete gene structures in human genomic DNA. , 1997, Journal of molecular biology.
[35] S Rackovsky,et al. On the properties and sequence context of structurally ambivalent fragments in proteins , 2003, Protein science : a publication of the Protein Society.
[36] E. Trifonov,et al. Enhancement of the nucleosomal pattern in sequences of lower complexity. , 1997, Nucleic acids research.
[37] S. Salzberg,et al. Alignment of whole genomes. , 1999, Nucleic acids research.
[38] Golan Yona,et al. Variations on probabilistic suffix trees: statistical modeling and prediction of protein families , 2001, Bioinform..
[39] Bogdan Dorohonceanu,et al. Accelerating Protein Classification Using Suffix Trees , 2000, ISMB.
[40] Judith Klein-Seetharaman,et al. Protein Classification Based on Text Document Classification Techniques , 2005, Proteins: Structure, Function, and Bioinformatics 58:955–970.
[41] E. B. Newman,et al. Tests of a statistical explanation of the rank-frequency relation for words in written English. , 1958, American Journal of Psychology.
[42] Jonathan Pevsner,et al. Basic Local Alignment Search Tool (BLAST) , 2005 .
[43] Sean R. Eddy,et al. Biological sequence analysis: Probabilistic approaches to phylogeny , 1998 .
[44] A K Konopka,et al. Noncoding DNA, Zipf's law, and language. , 1995, Science.
[45] S. Karlin,et al. Quantile distributions of amino acid usage in protein classes. , 1992, Protein engineering.
[46] D. Searls,et al. Robots in invertebrate neuroscience , 2002, Nature.