The UDP glycosyltransferase gene superfamily: recommended nomenclature update based on evolutionary divergence.

This review updates the nomenclature system for the UDP glucuronosyltransferase gene superfamily, which is based on divergent evolution. Since the previous review in 1991, sequences of many related UDP glycosyltransferases from lower organisms have appeared in the database, considerably expanding the set of known sequences. At latest count, in animals, yeast, plants and bacteria there are 110 distinct cDNAs/genes whose protein products all contain a characteristic 'signature sequence' and are therefore regarded as members of the same superfamily. Comparison of a relatedness tree of the proteins leads to the definition of 33 families. It should be emphasized that at least six cloned UDP-GlcNAc N-acetylglucosaminyltransferases are not sufficiently homologous to be included as members of this superfamily and may represent an example of convergent evolution. For naming each gene, it is recommended that the root symbol UGT for human (Ugt for mouse and Drosophila), denoting 'UDP glycosyltransferase,' be followed by an Arabic numeral representing the family, a letter designating the subfamily, and an Arabic numeral denoting the individual gene within the family or subfamily, e.g. 'human UGT2B4' and 'mouse Ugt2b5'. We recommend the name 'UDP glycosyltransferase' because many of the proteins do not preferentially use UDP glucuronic acid, or their nucleotide sugar preference is unknown. Whereas the gene symbol is italicized, the corresponding cDNA, transcript, protein and enzyme activity should be written with upper-case letters and without italics, e.g. 'human or mouse UGT1A1.' The UGT1 gene (spanning >500 kb) contains at least 12 promoters/first exons, each of which can be spliced to common exons 2 through 5, leading to gene products with different N-terminal halves but identical C-terminal halves; in this scheme each first exon is regarded as a distinct gene (e.g. UGT1A1, UGT1A2, ... UGT1A12).
When an orthologous gene between species cannot be identified with certainty, as occurs in the UGT2B subfamily, sequential naming of the genes is being carried out chronologically as they become characterized. We suggest that the Human Gene Nomenclature Guidelines (http://www.gene.acl.ac.uk/nomenclature/guidelines.html) be used for all species other than the mouse and Drosophila. Thirty published human UGT1A1 mutant alleles responsible for clinical hyperbilirubinemias are listed herein, and given numbers following an asterisk (e.g. UGT1A1*30) consistent with the Human Gene Nomenclature Guidelines. It is anticipated that this UGT gene nomenclature system will require updating on a regular basis.
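
The symbol format recommended above (root symbol, family numeral, subfamily letter, gene numeral, and an optional allele number after an asterisk) can be illustrated with a small parser. This is a minimal sketch and not part of the nomenclature itself: the names `UGTSymbol` and `parse_ugt_symbol` are hypothetical, and the sketch assumes that every symbol carries a subfamily letter, so a symbol written without one would be rejected. It also does not check that letter case matches the species convention.

```python
import re
from typing import NamedTuple, Optional


class UGTSymbol(NamedTuple):
    """Components of a UGT gene symbol (hypothetical container for illustration)."""
    root: str               # "UGT" (human style) or "Ugt" (mouse/Drosophila style)
    family: int             # Arabic numeral representing the family
    subfamily: str          # letter designating the subfamily
    gene: int               # Arabic numeral for the individual gene
    allele: Optional[int]   # allele number after the asterisk, if present


# Assumption: a subfamily letter is always present; case consistency is not enforced.
_SYMBOL = re.compile(
    r"^(?P<root>UGT|Ugt)"
    r"(?P<family>\d+)"
    r"(?P<subfamily>[A-Za-z])"
    r"(?P<gene>\d+)"
    r"(?:\*(?P<allele>\d+))?$"
)


def parse_ugt_symbol(symbol: str) -> UGTSymbol:
    """Split a symbol such as 'UGT1A1*30' into its named components."""
    match = _SYMBOL.match(symbol)
    if match is None:
        raise ValueError(f"not a recognized UGT symbol: {symbol!r}")
    allele = match.group("allele")
    return UGTSymbol(
        root=match.group("root"),
        family=int(match.group("family")),
        subfamily=match.group("subfamily"),
        gene=int(match.group("gene")),
        allele=int(allele) if allele is not None else None,
    )


if __name__ == "__main__":
    for example in ("UGT2B4", "Ugt2b5", "UGT1A1*30"):
        print(example, parse_ugt_symbol(example))
```

Under these assumptions, 'UGT2B4' yields family 2, subfamily B, gene 4, and 'UGT1A1*30' additionally yields allele 30.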