Bacterial genomes lacking long-range correlations may not be modeled by low-order Markov chains: The role of mixing statistics and frame shift of neighboring genes
Germinal Cocho | Pedro Miramontes | Wentian Li | Ricardo Mansilla
[1] E. Birney,et al. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. , 2008, Genome research.
[2] Wentian Li,et al. Universal 1/f noise, crossovers of scaling exponents, and chromosome-specific patterns of guanine-cytosine content in DNA sequences of the human genome. , 2004, Physical review. E, Statistical, nonlinear, and soft matter physics.
[3] Junwen Wang,et al. Generalizations of Markov model to characterize biological sequences , 2005, BMC Bioinformatics.
[4] J. Lobry. Asymmetric substitution patterns in the two DNA strands of bacteria. , 1996, Molecular biology and evolution.
[5] K. Dill,et al. A maximum entropy framework for nonexponential distributions , 2013, Proceedings of the National Academy of Sciences.
[6] Gene-Wei Li,et al. The anti-Shine-Dalgarno sequence drives translational pausing and codon choice in bacteria , 2012, Nature.
[7] Arend Hintze,et al. Scaling metagenome sequence assembly with probabilistic de Bruijn graphs , 2011, Proceedings of the National Academy of Sciences.
[8] F. De Amicis,et al. Intercodon dinucleotides affect codon choice in plant genes. , 2000, Nucleic acids research.
[9] D. Haussler,et al. A hidden Markov model that finds genes in E. coli DNA. , 1994, Nucleic acids research.
[10] Mark Borodovsky,et al. Deriving Non-homogeneous DNA Markov Chain Models by Cluster Analysis Algorithm Minimizing Multiple Alignment Entropy , 1994, Comput. Chem..
[11] C. Fuchs. On the distribution of the nucleotides in seven completely sequenced DNAs. , 1980, Gene.
[12] Antonio Marín,et al. Preference for guanosine at first codon position in highly expressed Escherichia coli genes. A relationship with translational efficiency , 1996, Nucleic Acids Res..
[13] Information decomposition of symbolic sequences , 2003, math/0302195.
[14] Leandro Pardo,et al. Testing the Order of Markov Dependence in DNA Sequences , 2011 .
[15] Daniel A. Henderson,et al. Fitting Markov chain models to discrete state series such as DNA sequences , 1999 .
[16] Françoise Argoul,et al. Multi-scale coding of genomic information: From DNA sequence to genome structure and function , 2011 .
[17] S Karlin,et al. Patchiness and correlations in DNA sequences , 1993, Science.
[18] Uwe Hassler,et al. Nonsensical and biased correlation due to pooling heterogeneous samples , 2003 .
[19] Jan Komorowski,et al. Nucleosomes are well positioned in exons and carry characteristic histone modifications. , 2009, Genome research.
[20] Simon Tavaré,et al. Codon preference and primary sequence structure in protein-coding regions , 1989 .
[21] Sean R. Eddy,et al. Biological sequence analysis: Preface , 1998 .
[22] D. Vere-Jones. Markov Chains , 1972, Nature.
[23] Latent Periodicity of Protein Sequences , 1999 .
[24] V. Tumanyan,et al. Coexistence of different base periodicities in prokaryotic genomes as related to DNA curvature, supercoiling, and transcription. , 2011, Genomics.
[25] Wentian Li. Mutual information functions versus correlation functions , 1990 .
[26] Wentian Li,et al. Three lectures on case-control genetic association analysis , 2007, Briefings Bioinform..
[27] Frank H. Eeckman,et al. Principal Component Analysis and Large-Scale Correlations in Non-Coding Sequences of Human DNA , 1996, J. Comput. Biol..
[28] J. Sánchez,et al. Analysis of bilateral inverse symmetry in whole bacterial chromosomes. , 2002, Biochemical and biophysical research communications.
[29] Bilal Salih,et al. Visible periodicity of strong nucleosome DNA sequences , 2015, Journal of biomolecular structure & dynamics.
[30] G Bernardi,et al. Compositional heterogeneity within and among isochores in mammalian genomes. I. CsCl and sequence analyses. , 2001, Gene.
[31] Hanspeter Herzel,et al. Correlations in DNA sequences: The role of protein coding segments , 1997 .
[32] Sergey V. Buldyrev,et al. Power Law Correlations in DNA Sequences , 2013 .
[33] Peter Avery,et al. Fitting interconnected Markov chain models—DNA sequences and test cricket matches , 2002 .
[34] B. Blaisdell,et al. Markov chain analysis finds a significant influence of neighboring bases on the occurrence of a base in eucaryotic nuclear DNA sequences both protein-coding and noncoding , 1985, Journal of Molecular Evolution.
[35] Astero Provata,et al. Complexity measures for the evolutionary categorization of organisms , 2014, Comput. Biol. Chem..
[36] Wentian Li,et al. The Study of Correlation Structures of DNA Sequences: A Critical Review , 1997, Comput. Chem..
[37] Wentian Li. The Measure of Compositional Heterogeneity in DNA Sequences Is Related to Measures of Complexity , 1997, adap-org/9709007.
[38] Yechezkel Kashi,et al. Three Sequence Rules for Chromatin , 2006, Journal of biomolecular structure & dynamics.
[39] E N Trifonov,et al. Sequence Structure of Hidden 10.4-base Repeat in the Nucleosomes of C. elegans , 2008, Journal of biomolecular structure & dynamics.
[40] Wentian Li,et al. Periodic Distribution of a Putative Nucleosome Positioning Motif in Human, Nonhuman Primates, and Archaea: Mutual Information Analysis , 2013, International journal of genomics.
[41] J. Shine,et al. The 3'-terminal sequence of Escherichia coli 16S ribosomal RNA: complementarity to nonsense triplets and ribosome binding sites. , 1974, Proceedings of the National Academy of Sciences of the United States of America.
[42] Daniel Segrè,et al. Chromosomal periodicity of evolutionarily conserved gene pairs , 2007, Proceedings of the National Academy of Sciences.
[43] V. Yampol’skii,et al. Binary N-step Markov chains and long-range correlated systems. , 2003, Physical review letters.
[44] Pavel A Pevzner,et al. How to apply de Bruijn graphs to genome assembly. , 2011, Nature biotechnology.
[45] Wentian Li. The complexity of DNA , 1997 .
[46] A. Raftery,et al. Estimation and Modelling Repeated Patterns in High Order Markov Chains with the Mixture Transition Distribution Model , 1994 .
[47] T. Haran,et al. The coexistence of the nucleosome positioning code with the genetic code on eukaryotic genomes , 2009, Nucleic acids research.
[48] E. Trifonov,et al. The pitch of chromatin DNA is reflected in its nucleotide sequence. , 1980, Proceedings of the National Academy of Sciences of the United States of America.
[49] P. Bernaola-Galván,et al. Compositional segmentation and long-range fractal correlations in DNA sequences. , 1996, Physical review. E, Statistical physics, plasmas, fluids, and related interdisciplinary topics.
[50] Durbin,et al. Biological Sequence Analysis , 1998 .
[51] Dimitris Kugiumtzis,et al. Investigating long range correlation in DNA sequences using significance tests of conditional mutual information , 2014, Comput. Biol. Chem..
[52] David Haussler,et al. A Generalized Hidden Markov Model for the Recognition of Human Genes in DNA , 1996, ISMB.
[53] Joaquín Sánchez. Sequences encoding identical peptides for the analysis and manipulation of coding DNA , 2013, Bioinformation.
[54] Wentian Li,et al. Long-range correlation and partial 1/f^α spectrum in a noncoding DNA sequence , 1992 .
[55] Jan Beran,et al. Statistics for long-memory processes , 1994 .
[56] Hanspeter Herzel,et al. 10-11 bp periodicities in complete genomes reflect protein structure and DNA folding , 1999, Bioinform..
[57] Gill Bejerano. Algorithms for variable length Markov chain modeling , 2004, Bioinform..
[58] Tetsuya Hayashi,et al. Complete Genome Sequence and Comparative Genome Analysis of Enteropathogenic Escherichia coli O127:H6 Strain E2348/69 , 2008, Journal of bacteriology.
[59] S. Franz,et al. Critical Phenomena in Natural Sciences: Chaos, Fractals, Selforganization and Disorder: Concepts and Tools , 2004 .
[60] Eugene V. Korotkov,et al. Latent sequence periodicity of some oncogenes and DNA-binding protein genes , 1997, Comput. Appl. Biosci..
[61] David Mary Rajathei,et al. Analysis of sequence repeats of proteins in the PDB , 2013, Comput. Biol. Chem..
[62] E. Trifonov. 3-, 10.5-, 200- and 400-base periodicities in genome sequences , 1998 .
[63] W Li,et al. Compositional heterogeneity within, and uniformity between, DNA sequences of yeast chromosomes. , 1998, Genome research.
[64] Jan Beran,et al. Long-Memory Processes: Probabilistic Properties and Statistical Methods , 2013 .
[65] S. Salzberg,et al. Microbial gene identification using interpolated Markov models. , 1998, Nucleic acids research.
[66] Lenwood S. Heath,et al. Genomic Signatures in De Bruijn Chains , 2007, WABI.
[67] I. Grosse,et al. MEASURING CORRELATIONS IN SYMBOL SEQUENCES , 1995 .
[68] P. Pevzner,et al. An Eulerian path approach to DNA fragment assembly , 2001, Proceedings of the National Academy of Sciences of the United States of America.
[69] Liisa Holm,et al. Rapid automatic detection and alignment of repeats in protein sequences , 2000, Proteins.
[70] P Bernaola-Galván,et al. Study of statistical correlations in DNA sequences. , 2002, Gene.
[71] P W Garden,et al. Markov analysis of viral DNA/RNA sequences. , 1980, Journal of theoretical biology.
[72] A. Cuticchia,et al. Influence of intercodon and base frequencies on codon usage in filarial parasites. , 2001, Genomics.
[73] G Bernardi,et al. Compositional heterogeneity within and among isochores in mammalian genomes. II. Some general comments. , 2001, Gene.
[74] Amrita Pati. Graph-based genomic signatures , 2008 .
[75] R. Voss,et al. Evolution of long-range fractal correlations and 1/f noise in DNA base sequences. , 1992, Physical review letters.
[76] D. Eisenberg,et al. A census of protein repeats. , 1999, Journal of molecular biology.
[77] A MARKOV MODEL FOR PROTEIN SEQUENCES , 2006 .
[78] H. Sebastian Seung,et al. Learning the parts of objects by non-negative matrix factorization , 1999, Nature.
[79] Nikolai A. Kudryashov,et al. Information decomposition method to analyze symbolical sequences , 2003 .