Theoretical molecular biology: prospectives and perspectives.

I briefly discuss some aspects of theoretical molecular biology: searches for homologies via string matching, searches for patterns of specific nucleotide groupings, and sequence-structure relationships. I describe the various approaches developed to address these problems and attempt to convey some of the excitement of this rapidly growing field.
