Sequence profiles of immunoglobulin and immunoglobulin-like domains.

Immunoglobulins (Ig) are highly modular proteins consisting of variable and constant domains with clear, conserved sequence patterns. These patterns have allowed T-cell receptor (TCR) and major histocompatibility complex (MHC) molecule domains, as well as some cell adhesion, cell surface receptor and muscle protein domains, to be identified as forming, together with the Ig domains, a superfamily of related proteins. The domains of these proteins have been grouped into four sets: variable (V-set), constant-1 (C1-set), constant-2 (C2-set) and intermediate (I-set). X-ray and NMR studies have shown that these domains form a Greek-key beta-sandwich structure, with the sets differing in the number of strands in the beta-sheets as well as in their sequence patterns. The conserved sequence elements in the major sets of Ig and Ig-like molecules have previously been reported as general sequence profiles. This work examines the variability within these sets. Detailed sequence profiles and consensus sequences were constructed for these sets and groups, and a novel form of presentation was developed to overcome some of the drawbacks of current methods of presenting consensus sequences. The profiles allow the similarities and differences among the sets of Ig and Ig-like sequences to be compared, and they provide a means of testing sequences for compatibility with Ig-like sequence motifs. In addition, the sequence separations of the main residues in the characteristic "pin" structure of Ig-like molecules were examined for variation among the groups. Measures of the degree of conservation within the groups of molecules were determined from the profiles and were used to assist in a reconsideration of possible evolutionary pathways between the major structural groups of the Ig superfamily.
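
The abstract does not spell out how the profiles, consensus sequences, or conservation measures were computed, so the following is only a minimal sketch of one common approach: per-column residue frequencies from an alignment, a majority-residue consensus, and Shannon entropy as a per-position variability score. The function names (`column_profile`, `consensus`, `shannon_entropy`) and the four-residue `TOY_ALIGNMENT` are hypothetical and chosen purely for illustration; they are not Ig sequences and not the authors' method.

```python
import math
from collections import Counter

# Hypothetical toy alignment (NOT real Ig sequences); all rows have equal length.
TOY_ALIGNMENT = ["CKAW", "CRAW", "CKGW", "CKAF"]


def column_profile(alignment):
    """Return one {residue: frequency} dict per alignment column."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for column in zip(*alignment):
        counts = Counter(column)
        total = len(column)
        profile.append({res: n / total for res, n in counts.items()})
    return profile


def consensus(profile):
    """Most frequent residue per column; ties are broken alphabetically."""
    return "".join(max(sorted(freqs), key=freqs.get) for freqs in profile)


def shannon_entropy(freqs):
    """Entropy in bits of one column; 0 means the position is fully conserved."""
    return -sum(p * math.log2(p) for p in freqs.values() if p > 0)


if __name__ == "__main__":
    profile = column_profile(TOY_ALIGNMENT)
    print("consensus:", consensus(profile))
    for position, freqs in enumerate(profile, start=1):
        print(position, round(shannon_entropy(freqs), 3), freqs)
```

In this sketch a low entropy marks a conserved position and a high entropy marks a variable one; averaging the per-column values over a set would give one possible group-level conservation figure, again as an assumption rather than the measure reported in the paper.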
