Comparative ngram analysis of whole-genome sequences
Jaime G. Carbonell | Judith Klein-Seetharaman | Madhavi K. Ganapathiraju | Roni Rosenfeld | Raj Reddy | D. Weisser
[1] S. Karlin et al. Quantile distributions of amino acid usage in protein classes. 1992, Protein Engineering.
[2] S. Erhan et al. Amino-acid neighborhood relationships in proteins. Breakdown of amino-acid sequences into overlapping doublets, triplets and quadruplets. 1980, International Journal of Bio-Medical Computing.
[3] Hiroki Arimura et al. Linear-Time Longest-Common-Prefix Computation in Suffix Arrays and Its Applications. 2001, CPM.
[4] A. A. Tsonis et al. Is DNA a language? 1997, Journal of Theoretical Biology.
[5] H. E. Stanley et al. Linguistic features of noncoding DNA sequences. 1994, Physical Review Letters.
[6] H. Herzel et al. Information content of protein sequences. 2000, Journal of Theoretical Biology.
[7] Martin Vingron et al. q-gram based database searching using a suffix array (QUASAR). 1999, RECOMB.
[8] Chan et al. Can Zipf distinguish language from noise in noncoding DNA? 1996, Physical Review Letters.
[9] Eugene W. Myers et al. Suffix arrays: a new method for on-line string searches. 1993, SODA '90.
[10] B. Berger et al. betawrap: Successful prediction of parallel β-helices from primary sequence reveals an association with many microbial pathogens. 2001, Proceedings of the National Academy of Sciences of the United States of America.
[11] S. F. Altschul et al. Statistical methods and insights for protein and DNA sequences. 1991, Annual Review of Biophysics and Biophysical Chemistry.
[12] S. Karlin et al. Over- and under-representation of short oligonucleotides in DNA sequences. 1992, Proceedings of the National Academy of Sciences of the United States of America.
[13] N. Jesper Larsson. Extended application of suffix trees to data compression. 1996, Proceedings of Data Compression Conference - DCC '96.
[14] Timothy B. Stockwell et al. The Sequence of the Human Genome. 2001, Science.
[15] Peter Weiner et al. Linear Pattern Matching Algorithms. 1973, SWAT.
[16] Wentian Li et al. Statistical Properties of Open Reading Frames in Complete Genome Sequences. 1999, Comput. Chem.
[17] D. Larhammar et al. Lack of biological significance in the 'linguistic features' of noncoding DNA--a quantitative analysis. 1996, Nucleic Acids Research.
[18] J. V. Moran et al. Initial sequencing and analysis of the human genome. 2001, Nature.
[19] Stanley et al. Correlations in binary sequences and a generalized Zipf analysis. 1995, Physical Review E, Statistical Physics, Plasmas, Fluids, and Related Interdisciplinary Topics.
[20] Ronald Rosenfeld et al. Statistical language modeling using the CMU-Cambridge toolkit. 1997, EUROSPEECH.