Comparative n-gram analysis of whole-genome protein sequences
Jaime G. Carbonell | Judith Klein-Seetharaman | Madhavi K. Ganapathiraju | Raj Reddy | D. Weisser | Roni Rosenfeld
[1] Stanley, et al. Correlations in binary sequences and a generalized Zipf analysis. 1995, Physical review. E, Statistical physics, plasmas, fluids, and related interdisciplinary topics.
[2] S. Karlin, et al. Over- and under-representation of short oligonucleotides in DNA sequences. 1992, Proceedings of the National Academy of Sciences of the United States of America.
[3] J. V. Moran, et al. Initial sequencing and analysis of the human genome. 2001, Nature.
[4] Wentian Li, et al. Statistical Properties of Open Reading Frames in Complete Genome Sequences. 1999, Comput. Chem.
[5] Roberto Grossi, et al. Compressed suffix arrays and suffix trees with applications to text indexing and string matching (extended abstract). 2000, STOC '00.
[6] Chan, et al. Can Zipf distinguish language from noise in noncoding DNA? 1996, Physical review letters.
[7] D. Larhammar, et al. Lack of biological significance in the 'linguistic features' of noncoding DNA--a quantitative analysis. 1996, Nucleic acids research.
[8] D. Mccormick. Sequence the Human Genome. 1986, Bio/Technology.
[9] B. Berger, et al. betawrap: Successful prediction of parallel β-helices from primary sequence reveals an association with many microbial pathogens. 2001, Proceedings of the National Academy of Sciences of the United States of America.
[10] S. Erhan, et al. Amino-acid neighborhood relationships in proteins. Breakdown of amino-acid sequences into overlapping doublets, triplets and quadruplets. 1980, International journal of bio-medical computing.
[11] A. A. Tsonis, et al. Is DNA a language? 1997, Journal of theoretical biology.
[12] H. E. Stanley, et al. Linguistic features of noncoding DNA sequences. 1994, Physical review letters.
[13] N. Jesper Larsson. Extended application of suffix trees to data compression. 1996, Proceedings of Data Compression Conference - DCC '96.
[14] Ronald Rosenfeld, et al. Statistical language modeling using the CMU-cambridge toolkit. 1997, EUROSPEECH.
[15] S. Karlin, et al. Quantile distributions of amino acid usage in protein classes. 1992, Protein engineering.
[16] H. Herzel, et al. Information content of protein sequences. 2000, Journal of theoretical biology.
[17] S. F. Altschul, et al. Statistical methods and insights for protein and DNA sequences. 1991, Annual review of biophysics and biophysical chemistry.
[18] Eugene W. Myers, et al. Suffix arrays: a new method for on-line string searches. 1993, SODA '90.
[19] Alberto Apostolico, et al. The Myriad Virtues of Subword Trees. 1985.
[20] A. K. Konopka, et al. Noncoding DNA, Zipf's law, and language. 1995, Science.
[21] Peter Weiner, et al. Linear Pattern Matching Algorithms. 1973, SWAT.
[22] Martin Vingron, et al. q-gram based database searching using a suffix array (QUASAR). 1999, RECOMB.
[23] Hiroki Arimura, et al. Linear-Time Longest-Common-Prefix Computation in Suffix Arrays and Its Applications. 2001, CPM.