Classification of the short‐chain dehydrogenase/reductase superfamily using hidden Markov models

The short‐chain dehydrogenase/reductase (SDR) superfamily now has over 47 000 members. Most of these are only distantly related, typically sharing 20–30% residue identity in pairwise comparisons, which makes it difficult to obtain an overview of the superfamily. We have therefore developed a family classification system based on hidden Markov models (HMMs). With this system, we have identified 314 SDR families, encompassing about 31 900 members. In addition, about 9700 SDR forms belong to families that currently have too few members to establish valid HMMs. In the human genome, we find 47 SDR families, corresponding to 82 genes. Thirteen families are present in all three domains (Eukaryota, Bacteria, and Archaea) and are hence expected to catalyze fundamental metabolic processes. The majority of these enzymes are of the ‘extended’ type, in agreement with earlier findings. About half of the SDR families are found only among bacteria, where the ‘classical’ SDR type is most prominent. The HMM‐based classification serves as the basis for a sustainable and expandable nomenclature system.
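As an illustration of the 20–30% pairwise residue identity quoted above, the following is a minimal sketch of one common way to compute percent identity from two already-aligned sequences. It is not taken from the paper: the function name `percent_identity`, the choice to ignore gapped columns in the denominator, and the two toy sequences are all assumptions made for this example, and other definitions of identity (for instance, dividing by the shorter sequence length) give different values.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where both sequences have a residue.

    Assumption: the inputs are rows of the same pairwise alignment, so they
    have equal length and use `gap` for alignment gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0   # columns where neither sequence has a gap
    identical = 0  # compared columns with the same residue

    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # gapped columns are excluded from the denominator
        compared += 1
        if res_a.upper() == res_b.upper():
            identical += 1

    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Hypothetical toy alignment, not real SDR sequences.
seq_a = "MKTAG-LVTG"
seq_b = "MRSA-ALITG"
print(f"{percent_identity(seq_a, seq_b):.1f}% identity")
```

On real data the two rows would come from a pairwise or multiple sequence alignment. At identity levels as low as those reported for SDRs, the value depends noticeably on the alignment itself and on how gaps are counted.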
