A grammar describing 'biological binding operators' to model gene regulation.

The study of the mechanisms that regulate protein synthesis is now sufficiently advanced that it is appropriate to consider a formalism for representing this knowledge. The objective of the syntactic grammar presented in this article is to represent these phenomena as they take place in the context of the cell. The proposed model considers two types of objects: transcriptional units on DNA, and the regulatory or structural proteins that are synthesised. Regulatory proteins are themselves destined to activate or repress other transcriptional units in a later phase. A transcriptional unit is described by the list of its active sites (operator, promoter, binding sites for transcription factors). A regulatory protein is described by the list of its active sites (binding domain, activation domain, binding domain for ligand). The DNA sites and the protein domains are the terminal symbols of the proposed grammar. The interaction of these proteins with the DNA, and in certain cases preliminary interactions between proteins, leads to one of two antagonistic actions: expression or repression of the transcriptional unit. These protein-protein and protein-DNA interactions are grouped into syntactic categories (induction, inhibition, initiation complex, repressor complex, activation complex), which are called biological binding operators. The expression/repression actions are described by grammar rules which provide the chain of execution of biological binding operators for the four activable/repressible regulatory systems modulated by positive/negative co-factors. The aim of this model is to observe a cell in a given state for a given process which involves a cascade of genes. The grammar is implemented by a simulation program which allows the user to vary the initial state of the cell and also to change parameters related to time and quantity.
This syntactic and generative grammar does not depend on the specifics of any particular transcriptional unit. The simulation uses examples which may combine several regulatory systems: the lac operon, regulation of metallothionein, galactose catabolism in yeast, the tryptophan operon, and phage lysogenic/lytic cascades.
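
To make the two object types and the four regulatory systems more concrete, the following is a minimal sketch in Python. It is not the simulation program described above: the class names, the `role` and `cofactor` fields, and the functions `binds_dna` and `state` are all hypothetical, introduced only for illustration. It assumes a deliberately reduced picture in which each regulatory protein has a single ligand, a positive co-factor enables DNA binding, a negative co-factor prevents it, and time and quantity are ignored. The lac example also omits activation by CAP.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Tuple


@dataclass(frozen=True)
class TranscriptionalUnit:
    """A unit on DNA, described by the list of its active sites."""
    name: str
    sites: Tuple[str, ...]  # terminal symbols, e.g. "promoter", "operator"


@dataclass(frozen=True)
class RegulatoryProtein:
    """A regulatory protein, described by the list of its domains."""
    name: str
    domains: Tuple[str, ...]  # terminal symbols, e.g. "binding domain"
    target_site: str          # DNA site recognised by the binding domain
    role: str                 # "repressor" or "activator" (assumed labels)
    ligand: str               # co-factor recognised by the ligand domain
    cofactor: str             # "positive": ligand enables DNA binding
                              # "negative": ligand prevents DNA binding


def binds_dna(protein: RegulatoryProtein, unit: TranscriptionalUnit,
              ligands: FrozenSet[str]) -> bool:
    """Protein-DNA interaction, gated by the protein-ligand interaction."""
    if protein.target_site not in unit.sites:
        return False
    has_ligand = protein.ligand in ligands
    return has_ligand if protein.cofactor == "positive" else not has_ligand


def state(unit: TranscriptionalUnit, proteins: Iterable[RegulatoryProtein],
          ligands: FrozenSet[str]) -> str:
    """Return 'expressed' or 'repressed' for one unit in one cell state."""
    proteins = list(proteins)
    bound = [p for p in proteins if binds_dna(p, unit, ligands)]
    # A bound repressor blocks the unit (simplified repressor complex).
    if any(p.role == "repressor" for p in bound):
        return "repressed"
    # If the unit carries an activator site, an activator must be bound.
    needs_activator = any(
        p.role == "activator" and p.target_site in unit.sites
        for p in proteins
    )
    if needs_activator and not any(p.role == "activator" for p in bound):
        return "repressed"
    return "expressed"


# lac operon: repressor with a negative co-factor (the inducer).
lac = TranscriptionalUnit("lac", ("promoter", "operator"))
lac_repressor = RegulatoryProtein(
    "LacI", ("binding domain", "binding domain for ligand"),
    target_site="operator", role="repressor",
    ligand="allolactose", cofactor="negative",
)

# Tryptophan operon: repressor with a positive co-factor (the co-repressor).
trp = TranscriptionalUnit("trp", ("promoter", "operator"))
trp_repressor = RegulatoryProtein(
    "TrpR", ("binding domain", "binding domain for ligand"),
    target_site="operator", role="repressor",
    ligand="tryptophan", cofactor="positive",
)

for ligands in (frozenset(), frozenset({"allolactose", "tryptophan"})):
    print(sorted(ligands),
          "lac:", state(lac, [lac_repressor], ligands),
          "trp:", state(trp, [trp_repressor], ligands))
```

Varying the set of ligands plays the role of changing the initial state of the cell. The two remaining regulatory systems would be obtained in this sketch by setting `role="activator"` with either co-factor sign.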
