The InterPro BioMart: federated query and web service access to the InterPro Resource

The InterPro BioMart provides users with query-optimized access to predictions of family classification, protein domains and functional sites, based on a broad spectrum of integrated computational models (‘signatures’) that are generated by the InterPro member databases: Gene3D, HAMAP, PANTHER, Pfam, PIRSF, PRINTS, ProDom, PROSITE, SMART, SUPERFAMILY and TIGRFAMs. These predictions are provided for all protein sequences from both the UniProt Knowledgebase (UniProtKB) and the UniParc protein sequence archive. The InterPro BioMart complements the primary InterPro web interface (http://www.ebi.ac.uk/interpro) by providing a web service and the ability to build complex, custom queries that can efficiently return thousands of rows of data in a variety of formats. This article describes the information available from the InterPro BioMart and illustrates its utility with examples of how to build queries that return useful biological information.

Database URL: http://www.ebi.ac.uk/interpro/biomart/martview
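To give a sense of what a custom query sent to a BioMart web service can look like, the following is a minimal sketch in Python that assembles a query in the generic BioMart XML query format and encodes it into a request URL. It does not contact any server. The `martservice` endpoint path is an assumption inferred from the `martview` URL, and the dataset, filter and attribute names (`protein`, `protein_accession`, `entry_id`, `entry_name`) are hypothetical placeholders rather than names confirmed by the InterPro BioMart configuration.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Assumption: the web service endpoint sits next to the martview URL.
MARTSERVICE_URL = "http://www.ebi.ac.uk/interpro/biomart/martservice"


def build_query(dataset, filters, attributes, formatter="TSV"):
    """Return a BioMart XML query string for one dataset."""
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": formatter,      # output format, e.g. TSV or CSV
        "header": "1",               # include a header row
        "uniqueRows": "1",           # collapse duplicate rows
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    # Filters restrict which rows are returned.
    for name, value in filters.items():
        ET.SubElement(ds, "Filter", {"name": name, "value": value})
    # Attributes select which columns are returned.
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    body = ET.tostring(query, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>' + body


def build_request_url(query_xml):
    """URL-encode the XML query into the 'query' request parameter."""
    return MARTSERVICE_URL + "?" + urlencode({"query": query_xml})


# Hypothetical dataset, filter and attribute names, for illustration only.
xml_query = build_query(
    dataset="protein",
    filters={"protein_accession": "P12345"},
    attributes=["protein_accession", "entry_id", "entry_name"],
)
request_url = build_request_url(xml_query)
print(request_url)
```

The real dataset, filter and attribute names would need to be taken from the mart's own configuration before a URL built this way could be expected to return results.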
