metabolicMine: an integrated genomics, genetics and proteomics data warehouse for common metabolic disease research

Common metabolic and endocrine diseases such as diabetes affect millions of people worldwide and have a major health impact, frequently leading to complications and mortality. In a search for better prevention and treatment, there is ongoing research into the underlying molecular and genetic bases of these complex human diseases, as well as into the links with risk factors such as obesity. Although an increasing number of relevant genomic and proteomic data sets have become available, the quantity and diversity of the data make their efficient exploitation challenging. Here, we present metabolicMine, a data warehouse with a specific focus on the genomics, genetics and proteomics of common metabolic diseases. Developed in collaboration with leading UK metabolic disease groups, metabolicMine integrates data sets from a range of experiments and model organisms alongside tools for exploring them. The current version brings together information covering genes, proteins, orthologues, interactions, gene expression, pathways, ontologies, diseases, genome-wide association studies and single nucleotide polymorphisms. Although the emphasis is on human data, key data sets from mouse and rat are included. These are complemented by interoperation with the RatMine rat genomics database, with a corresponding mouse version under development by the Mouse Genome Informatics (MGI) group. The web interface contains a number of features including keyword search, a library of Search Forms, the QueryBuilder and list analysis tools. This provides researchers with many different ways to analyse, view and flexibly export data. Programming interfaces and automatic code generation in several languages are supported, and many of the features of the web interface are available through web services. The combination of diverse data sets integrated with analysis tools and a powerful query system makes metabolicMine a valuable research resource. 
The web interface makes it accessible to first-time users, whereas the Application Programming Interface (API) and web services provide convenient data access and tools for bioinformaticians. metabolicMine is freely available online at http://www.metabolicmine.org.
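
As a rough illustration of the web-service access mentioned above, the sketch below builds a path-query document and the URL that would carry it to a results endpoint, without sending any request. It assumes the general InterMine convention of an XML query with a `view` attribute and `constraint` elements, submitted to a `query/results` resource. The service root, the `Gene.*` paths and the function names are hypothetical placeholders, not details taken from metabolicMine itself.

```python
from urllib.parse import urlencode
from xml.etree import ElementTree as ET

# Hypothetical service root: replace with the real web-service base URL.
SERVICE_ROOT = "https://example.org/mine/service"


def build_query(view, constraints, model="genomic"):
    """Return an XML path query as a string.

    view        -- list of output column paths (assumed names, e.g. "Gene.symbol")
    constraints -- list of (path, operator, value) tuples
    """
    query = ET.Element("query", {"model": model, "view": " ".join(view)})
    for path, op, value in constraints:
        ET.SubElement(query, "constraint", {"path": path, "op": op, "value": value})
    return ET.tostring(query, encoding="unicode")


def build_results_url(service_root, query_xml, fmt="json"):
    """Return the URL that would request results for the query (nothing is sent)."""
    params = urlencode({"query": query_xml, "format": fmt})
    return f"{service_root}/query/results?{params}"


if __name__ == "__main__":
    # Assumed example: symbols and names of human genes.
    xml = build_query(
        ["Gene.symbol", "Gene.name"],
        [("Gene.organism.name", "=", "Homo sapiens")],
    )
    print(xml)
    print(build_results_url(SERVICE_ROOT, xml))
```

In practice the code generated by the web interface or an official client library would usually be preferable to assembling URLs by hand; the sketch only shows the shape such a request might take.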
