Molecular phylogeny of the kelch-repeat superfamily reveals an expansion of BTB/kelch proteins in animals

Background
The kelch motif is an ancient and evolutionarily widespread sequence motif of 44–56 amino acids. It occurs as five to seven repeats that form a β-propeller tertiary structure. Over 28 kelch-repeat proteins have been sequenced and functionally characterised from diverse organisms, ranging from viruses, plants and fungi to mammals, and it is evident from expressed sequence tag, domain and genome databases that many additional hypothetical proteins contain kelch repeats. In general, kelch-repeat β-propellers are involved in protein-protein interactions; however, the modest sequence identity between kelch motifs, the diversity of domain architectures, and the partial information on this protein family in any single species all present difficulties for developing a coherent view of the kelch-repeat domain and the kelch-repeat protein superfamily. To understand the complexity of this superfamily, we have used bioinformatics to analyse the complement of kelch-repeat proteins encoded in the human genome and have compared it with the kelch-repeat proteins encoded in other sequenced genomes.

Results
We identified 71 kelch-repeat proteins encoded in the human genome, whereas 5 or 8 members were identified in yeasts and around 18 in C. elegans, D. melanogaster and A. gambiae. Multiple domain architectures were identified in each organism, including previously unrecognised forms. The vast majority of kelch-repeat domains are predicted to form six-bladed β-propellers. The most prevalent domain architecture in the metazoan genomes studied was the BTB/kelch domain organisation, and we uncovered three subgroups of human BTB/kelch proteins. Sequence analysis of the kelch-repeat domains of the most robustly related subgroups identified differences in β-propeller organisation that could provide direction for experimental study of protein-binding characteristics.

Conclusion
The kelch-repeat superfamily constitutes a distinct and evolutionarily widespread family of β-propeller domain-containing proteins. Expansion of the family during the evolution of multicellular animals is mainly accounted for by a major expansion of the BTB/kelch domain architecture. BTB/kelch proteins constitute 72% of the kelch-repeat superfamily of H. sapiens and form three subgroups, one of which appears to be the most conserved during evolution. Distinctions in propeller blade organisation between subgroups 1 and 2 were identified that could provide new direction for biochemical and functional studies of novel kelch-repeat proteins.
