Classification of the short-chain dehydrogenase/reductase superfamily using hidden Markov models.
Kallberg Y., Oppermann U., Persson B.
The short-chain dehydrogenase/reductase (SDR) superfamily now has over 47,000 members. Most of these are only distantly related, typically sharing 20-30% residue identity in pairwise comparisons, which makes it difficult to obtain an overview of the superfamily. We have therefore developed a family classification system based upon hidden Markov models (HMMs). To this end, we have identified 314 SDR families, encompassing about 31,900 members. In addition, about 9,700 SDR forms belong to families that at present have too few members to establish valid HMMs. In the human genome, we find 47 SDR families, corresponding to 82 genes. Thirteen families are present in all three domains (Eukaryota, Bacteria, and Archaea), and are hence expected to catalyze fundamental metabolic processes. The majority of these enzymes are of the 'extended' type, in agreement with earlier findings. About half of the SDR families are found only among bacteria, where the 'classical' SDR type is most prominent. The HMM-based classification is used as a basis for a sustainable and expandable nomenclature system.
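
For readers unfamiliar with the "residue identity in pairwise comparisons" measure mentioned above, the following is a minimal Python sketch of one common way to compute it from two already-aligned sequences. It is an illustration only, not the procedure used in this work: the function name `percent_identity`, the choice to count only columns where neither sequence has a gap, and the toy sequences are all assumptions made for the example.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues between two pre-aligned sequences.

    Assumption for this sketch: only columns in which neither sequence
    has a gap are counted. Other denominators (for example, the full
    alignment length) are also in use and give different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0   # columns with a residue in both sequences
    identical = 0  # compared columns where the residues match

    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a.upper() == b.upper():
            identical += 1

    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Toy aligned sequences (hypothetical, not real SDR data).
seq_a = "MKV-LITGAS"
seq_b = "MRVAL-TGGT"
print(percent_identity(seq_a, seq_b))
```

Under this definition, a value in the 20-30% range means that only about one in four or five aligned positions carries the same residue in both sequences, which is why simple pairwise comparison gives a poor overview of such a family and profile methods such as HMMs are attractive.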