Structure of a cDNA for the pro alpha 2 chain of human type I procollagen. Comparison with chick cDNA for pro alpha 2(I) identifies structurally conserved features of the protein and the gene.

Nucleotide sequences were determined for cloned cDNAs encoding more than half of the pro alpha 2 chain of human type I procollagen. Comparison with previously published data on homologous cDNAs from chick embryos made it possible to examine evolution of the gene in two species that diverged 250-300 million years ago. The amino acid sequence of the alpha-chain domain supported previous indications that there is strong selective pressure to maintain glycine as every third amino acid and to maintain a prescribed distribution of charged amino acids. However, there is little apparent selective pressure on other amino acids. The amino acid sequence of the C-propeptide domain showed less divergence than the alpha-chain domain. The N terminus of the human C-propeptide (the 5' end of its coding sequence), however, contained an insert of 12 bases coding for 4 amino acids not found in the chick C-propeptide. About 100 amino acid residues from the N terminus, two residues found in the chick sequence were missing from the human sequence. In the second half of the C-propeptide, there was complete conservation of a 37 amino acid sequence and conservation of 50 out of 51 amino acids in the same region, an observation which suggested that the region serves some special purpose, such as directing the association of one pro alpha 2(I) C-propeptide with two pro alpha 1(I) C-propeptides to produce the heteropolymeric structure of type I procollagen. In addition, comparison of human and chick DNAs for pro alpha 2(I) revealed three different classes of conservation of nucleotide sequence that have no apparent effect on the structure of the protein: a preference for U in the third base position of codons for glycine, proline, and alanine; a high degree of nucleotide conservation in the 51 amino acid highly conserved region of the C-propeptide; and a high degree of nucleotide conservation in the 3'-noncoding region. These three classes of nucleotide conservation may reflect unusual features of collagen genes, such as their high GC content or their highly repetitive coding sequences.
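
The first class of nucleotide conservation, a preference for U in the third position of glycine, proline, and alanine codons, can be illustrated with a small tally. The following is a minimal sketch, not the authors' analysis: the function name `third_base_usage` and the toy sequence are hypothetical, and the sequence is invented for illustration rather than taken from the human or chick cDNA.

```python
from collections import Counter

# First two bases of the codon families for the three amino acids
# named in the abstract (GGN = Gly, CCN = Pro, GCN = Ala).
CODON_PREFIXES = {"Gly": "GG", "Pro": "CC", "Ala": "GC"}


def third_base_usage(coding_seq):
    """Count third-position bases of Gly, Pro and Ala codons.

    The sequence is assumed to be in frame, starting at codon position 1.
    DNA input is accepted; T is converted to U.
    """
    seq = coding_seq.upper().replace("T", "U")
    usage = {aa: Counter() for aa in CODON_PREFIXES}
    # Step through complete codons only, ignoring any trailing bases.
    for i in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[i:i + 3]
        for aa, prefix in CODON_PREFIXES.items():
            if codon.startswith(prefix):
                usage[aa][codon[2]] += 1
    return usage


# Invented toy sequence (NOT real collagen data): three Gly-Pro-Ala repeats.
toy = "GGUCCUGCUGGCCCUGCAGGUCCAGCU"
usage = third_base_usage(toy)
for aa, counts in usage.items():
    print(aa, dict(counts))
```

Applied to aligned coding sequences from two species, the same tally would show whether U is over-represented at these silent positions in both, which is the kind of conservation the abstract describes as having no effect on protein structure.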
