A new function evolved from gene fusion.

What constitutes genetic difference among organisms? How do new gene functions originate in nature? Since the early days of molecular biology, we have known that homologous genes between species differ in DNA and protein sequence. Noncoding regions have also been evolving, with repetitive sequences, transposable elements, and other elements continuously reshaping the genomes of organisms. As more genomes of humans and other organisms are examined, it has also become clear that species differ not only in these two genomic parameters but also in the number and kinds of genes. Genes are subject to a life-and-death process: New genes have originated continuously throughout evolution. For example, Drosophila melanogaster contains 87 cuticle protein genes, whereas Caenorhabditis elegans contains no such genes in its genome (Rubin et al. 2000). If this comparison seems to involve organisms that are too divergent, consider recently diverged sibling species. Drosophila teissieri and Drosophila yakuba contain a gene called jingwei (Long and Langley 1993; Wang et al. 2000), which originated only 2.5 million years ago. D. melanogaster itself has a unique gene, Sdic, which is expressed specifically in the sperm tail and is absent even from its closest relatives (Nurminsky et al. 1998). New genes often give rise to new biological functions driven by adaptive Darwinian selection (Long and Langley 1993; Chen et al. 1997; Begun 1997; Nurminsky et al. 1998). New genes may even have controlled the origination of new species, for example, Odysseus, a homeobox duplicate gene in Drosophila (Ting et al. 1998). Such new genes are associated with two conspicuous changes consistent with the origin of new functions: high protein substitution rates and drastic changes in gene structure. Drosophila is not the only organism whose genome has been found to give rise to new protein-coding genes that differentiate one species from another.
Other organisms, including plants and mammals, also have newly originated genes. For example, the Mus musculus genome contains multiple copies of the new gene SP100-rs, which is absent in its sibling species Mus caroli (Weichenhan et al. 1998), although little is known about its evolution and function. In potato, a new cytochrome c1 evolved a mitochondrial targeting function (Long et al. 1996). Retrosequences may have contributed to the origin of new vertebrate regulatory elements or new parts of vertebrate coding regions (Brosius 1999). In these cases, recombination of protein modules and gene duplication played essential roles in creating the initial gene structures, and natural selection participated in the subsequent evolution. Although insights from young chimerical genes in Drosophila have enormously changed our views of new gene evolution, good data from humans or other mammals have been lacking. This is a significant hurdle for understanding new gene evolution in the genetic systems of humans and their primate relatives. In this issue, Thomson et al. (2000) present a clear example of how new genes with novel functions can originate in humans and other mammals, including the molecular process and the derived biological function. A closer look at the origination of this new gene, Kua-UEV, offers insights into the general problem of human gene origination. UEV is a conserved gene, distributed across all major eukaryotic lineages ranging from animals to fungi, plants, and protozoa. The UEV proteins in these organisms share multiple functions, for example, cell protection, c-FOS transcription, and cell-cycle progression (Sancho et al. 1998; Thomson et al. 1998; Xiao et al. 1998). In Saccharomyces cerevisiae, the UEV protein controls elongation of polyubiquitin chains when associated with ubiquitin-conjugating enzymes (E2; Hoffman and Pickart 1999). The UEV genes in divergent organisms have maintained a very conserved structure in their common domain (C domain).
However, one isoform of the human gene carries an additional domain (B domain) that does not exist in other organisms and thus creates a new, chimerical gene structure. How did this new structure originate, and where does the B domain come from? At first glance, this human gene is reminiscent of the chimerical structure of two young Drosophila genes. The first example is jingwei, which is composed of a major domain and an additional N-terminal domain (Long and Langley 1993). Recent work implies that the mosaic structure of jingwei was created by insertion of the retrosequence of the alcohol dehydrogenase gene into a previously existing gene, recruiting a portion of the N-terminal domain (Long et al. 1999; Wang et al. 2000). The second example is Sdic, which was created by a deletion in two adjacent genes at the DNA level (Nurminsky et al. 1998). However, the human UEV gene seems to have taken a different evolutionary route to acquire its additional B domain (Fig. 1). In the genomic databases of D. melanogaster and C. elegans, two small DNA fragments unrelated to the UEV gene in these species were found to be significantly similar to the B domain of the human UEV gene. Further analysis showed that these are seven exons encoding a 319-amino acid protein in C. elegans and five exons encoding a 326-amino acid protein in D. melanogaster. This newly discovered gene, named Kua (derived from the word “Cua” in Catalan, which means “tail” or “queue”) en

E-mail mlong@midway.uchicago.edu; fax (773) 702-9740. Article and publication are at www.genome.org/cgi/doi/10.1101/gr.165700. Insight/Outlook
