The PRINTS protein fingerprint database: functional and evolutionary applications

The PRINTS database is a compendium of protein fingerprints used to assign uncharacterised sequences to known families and hence to infer tentative functions. Fingerprints are groups of conserved sequence motifs that together provide diagnostic family signatures. Their diagnostic potency derives both from the mutual context afforded by matching motif neighbours (a feature that cannot be exploited by single-motif approaches) and from the hierarchical nature of the family analysis, which highlights the often-subtle differences that constitute the functional determinants distinguishing closely related families and their subfamilies. Beyond its use as a signature resource and the development of an automatic supplement, prePRINTS, we describe here other features that distinguish PRINTS from, and render it complementary to, related protein family databases. We also outline its application to investigations of evolutionary relationships and to functional site prediction in complex gene families, such as ion channels, G protein-coupled receptors and phosphatases.

Keywords: multiple motifs; hierarchical sequence analysis; signature database; protein family characterisation; G protein-coupled receptors; phosphatases; functional annotation; ligand binding; allosteric sites