The structure and evolution of the human β-globin gene family

Argiris Efstratiadis
Department of Biological Chemistry, Harvard Medical School, Boston, Massachusetts 02115

James W. Posakony, Tom Maniatis, Richard M. Lawn and Catherine O’Connell
Division of Biology, California Institute of Technology, Pasadena, California 91125

Richard A. Spritz, Jon K. DeRiel, Bernard G. Forget and Sherman M. Weissman
Departments of Genetics and Internal Medicine, Yale University School of Medicine, New Haven, Connecticut 06510

Jerry L. Slightom, Ann E. Blechl and Oliver Smithies
Laboratory of Genetics, University of Wisconsin, Madison, Wisconsin 53706

Francisco E. Baralle, Carol C. Shoulders and Nicholas J. Proudfoot
MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 2QH, England

Summary

We present the results of a detailed comparison of the primary structure of human β-like globin genes and their flanking sequences. Among the sequences located 5′ to these genes are two highly conserved regions which include the sequences ATA and CCAAT, located 31 ± 1 and 77 ± 10 bp, respectively, 5′ to the mRNA capping site. Similar sequences are found in the corresponding locations in most other eucaryotic structural genes. Calculation of the divergence times of individual β-like globin gene pairs provides the first description of the evolutionary relationships within a gene family based entirely on direct nucleotide sequence comparisons. In addition, the evolutionary relationship of the embryonic ε-globin gene to the other human β-like globin genes is defined for the first time. Finally, we describe a model for the involvement of short direct repeat sequences in the generation of deletions in the noncoding and coding regions of β-like globin genes during evolution.
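
The positional statement in the Summary (ATA and CCAAT at fixed distances 5′ to the capping site) can be illustrated with a minimal Python sketch. This is not the authors' procedure. The function names, the convention of measuring distance from the cap site to the first base of the motif, and the synthetic test sequence are all assumptions made for illustration; no real globin sequence is used.

```python
def motif_offsets(upstream: str, motif: str) -> list[int]:
    """Return the distance (bp) from the cap site to the first base of each motif hit.

    Assumption: `upstream` is the 5' flanking sequence read 5'->3', and its
    last base lies immediately 5' to the cap site (distance 1).
    """
    upstream = upstream.upper()
    motif = motif.upper()
    hits = []
    start = upstream.find(motif)
    while start != -1:
        hits.append(len(upstream) - start)
        start = upstream.find(motif, start + 1)  # allow overlapping hits
    return hits


def hits_in_window(upstream: str, motif: str, centre: int, tolerance: int) -> list[int]:
    """Keep only the hits whose distance lies within centre +/- tolerance bp."""
    return [d for d in motif_offsets(upstream, motif) if abs(d - centre) <= tolerance]


# Synthetic 100-bp flanking sequence (hypothetical, not a real gene):
# CCAAT starts 77 bp and ATA starts 31 bp 5' to the cap site.
synthetic_upstream = "G" * 23 + "CCAAT" + "G" * 41 + "ATA" + "G" * 28

ata_hits = hits_in_window(synthetic_upstream, "ATA", centre=31, tolerance=1)
ccaat_hits = hits_in_window(synthetic_upstream, "CCAAT", centre=77, tolerance=10)
print(ata_hits, ccaat_hits)
```

The windows (31 ± 1 and 77 ± 10) are taken from the Summary. Which base of each motif the published distances refer to is not stated there, so the offset convention above should be adjusted to match the paper's own figures before it is applied to real sequences.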
