BioMart: a data federation framework for large collaborative projects

BioMart is a freely available, open source, federated database system that provides unified access to disparate, geographically distributed data sources. It is designed to be data agnostic and platform independent, so that existing databases can easily be incorporated into the BioMart framework. BioMart allows databases hosted on different servers to be presented seamlessly to users, facilitating collaborative projects between different research groups. BioMart contains several levels of query optimization to manage large data sets efficiently, and it offers a selection of graphical user interfaces and application programming interfaces so that queries can be performed in whatever manner is most convenient for the user. The software has been adopted by a large number of biological databases spanning a wide range of data types, providing a rich source of annotation for bioinformaticians and biologists alike. Database URL: http://www.biomart.org
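As an illustration of the programmatic access mentioned above, the following is a minimal sketch of how a client might assemble a query in the XML form commonly accepted by BioMart web services. The helper name `build_query` is hypothetical, and the dataset, filter, and attribute names (`hsapiens_gene_ensembl`, `chromosome_name`, `ensembl_gene_id`, `external_gene_name`) are example values assumed from the Ensembl mart; other marts expose different names. The sketch only builds the query string and does not contact any server.

```python
import xml.etree.ElementTree as ET


def build_query(dataset, attributes, filters=None):
    """Return a BioMart-style XML query string (hypothetical helper).

    dataset    -- name of the dataset to query (assumed example value)
    attributes -- list of attribute (output column) names
    filters    -- optional dict mapping filter names to values
    """
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",       # tab-separated output
        "header": "0",            # no header row
        "uniqueRows": "1",        # collapse duplicate rows
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset",
                       {"name": dataset, "interface": "default"})
    # Filters restrict which records are returned.
    for name, value in (filters or {}).items():
        ET.SubElement(ds, "Filter", {"name": name, "value": str(value)})
    # Attributes select which columns are returned.
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    header = '<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>'
    return header + ET.tostring(query, encoding="unicode")


# Example values are assumptions, not taken from the abstract.
xml_query = build_query(
    "hsapiens_gene_ensembl",
    attributes=["ensembl_gene_id", "external_gene_name"],
    filters={"chromosome_name": "1"},
)
print(xml_query)
```

A string of this kind is typically submitted as the `query` parameter of a mart's web service endpoint; the exact endpoint path depends on the deployment and is not specified here.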
