Molecular Evolution of the GATA Family of Transcription Factors: Conservation Within the DNA-Binding Domain

Abstract. The GATA-binding transcription factors comprise a protein family whose members contain either one or two highly conserved zinc finger DNA-binding domains. Members of this group have been identified in organisms ranging from cellular slime mold to vertebrates, including plants, fungi, nematodes, insects, and echinoderms. While much work has been done describing the expression patterns, functional aspects, and target genes for many of these proteins, an evolutionary analysis of the entire family has been lacking. Herein we show that only the C-terminal zinc finger (Cf) and basic domain, which together constitute the GATA-binding domain, are conserved throughout this protein family. Phylogenetic analyses of amino acid sequences demonstrate distinct evolutionary pathways. Analysis of GATA factors isolated from vertebrates suggests that the six distinct vertebrate GATAs are descended from a common ancestral sequence, while those isolated from nonvertebrates (with the exception of the fungal AREA orthologues and Arabidopsis paralogues) appear to be related only within the DNA-binding domain and otherwise provide little insight into their evolutionary history. These results suggest multiple modes of evolution, including gene duplication and modular evolution of GATA factors based upon inclusion of a class IV zinc finger motif. As such, GATA transcription factors represent a group of proteins related solely by their homologous DNA-binding domains. Further analysis of this domain examines the degree of conservation at each amino acid site using the Boltzmann entropy measure, thereby identifying residues critical to preservation of structure and function. Finally, we construct a predictive motif that can accurately identify potential GATA proteins.
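The per-site conservation analysis mentioned above can be illustrated with a minimal sketch. It assumes the entropy at each alignment column takes the standard form H = -Σ p_i ln p_i, where p_i is the observed frequency of residue i in that column; the abstract does not give the exact normalization or gap handling used, so those details are assumptions here. The function names and the toy alignment are hypothetical and are not taken from the paper's data.

```python
import math
from collections import Counter


def site_entropy(column):
    """Entropy (in nats) of one alignment column: H = -sum(p * ln p).

    Assumption: every character, including a gap '-', counts as a symbol.
    """
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log(n / total) for n in counts.values())


def alignment_entropies(seqs):
    """Return the entropy of each column of an aligned set of sequences."""
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    # zip(*seqs) walks the alignment column by column.
    return [site_entropy(col) for col in zip(*seqs)]


if __name__ == "__main__":
    # Hypothetical toy alignment, for illustration only.
    toy = ["CNACG", "CNACG", "CSACG", "CTNCG"]
    for pos, h in enumerate(alignment_entropies(toy), start=1):
        print(f"site {pos}: H = {h:.3f}")
```

A column with entropy 0 is invariant across the aligned sequences, while higher values indicate more variable sites; under this reading, low-entropy positions are the candidates for residues critical to structure and function.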
