The complexity, challenges and benefits of comparing two transporter classification systems in TCDB and Pfam

Transport systems comprise roughly 10% of all proteins in a cell and play critical roles in many processes. Improving and expanding their classification is an important goal that can affect studies ranging from comparative genomics to searches for potential drug targets. It is not surprising that different classification systems for transport proteins have arisen, whether within a specialized database focused on this functional class of proteins or as part of a broader classification system for all proteins. Two such databases are the Transporter Classification Database (TCDB) and the Protein families (Pfam) database. As part of a long-term endeavor to improve consistency between the two classification systems, we compared transporter annotations in the two databases to understand the rationale for their differences and to improve both systems. Some differences simply reflect the fact that one database contains a particular transporter family while the other does not. Differing family definitions and hierarchical organizations were reconciled, which led to the recognition that 69 Pfam ‘Domains of Unknown Function’ are in fact transport protein families; these are to be renamed using TCDB annotations. Of over 400 potential new Pfam families identified from TCDB, 10% have already been added to Pfam, and TCDB has created 60 new entries based on Pfam data. This work, for the first time, reveals the benefits of comprehensive database comparisons and explains the differences between Pfam and TCDB.
