MiMiR – an integrated platform for microarray data sharing, mining and analysis

Background
Despite considerable efforts within the microarray community to standardise data format, content and description, microarray technologies present major challenges in managing, sharing, analysing and re-using the large amounts of data generated locally or internationally. Additionally, it is recognised that inconsistent and low quality experimental annotation in public data repositories significantly compromises the re-use of microarray data for meta-analysis. MiMiR, the Microarray data Mining Resource, was designed to tackle some of these limitations and challenges. Here we present new software components and enhancements to the original infrastructure that increase accessibility, utility and opportunities for large scale mining of experimental and clinical data.

Results
A user friendly Online Annotation Tool allows researchers to submit detailed experimental information via the web at the time of data generation rather than at the time of publication. This ensures that the meta-data collected are easy to access and highly accurate. Experiments are programmatically built in the MiMiR database from the submitted information, and details are systematically curated and further annotated by a team of trained annotators using a new Curation and Annotation Tool. Clinical information can be annotated and coded with a clinical Data Mapping Tool within an appropriate ethical framework. Users can visualise experimental annotation, assess data quality, and download and share data via a web-based experiment browser called MiMiR Online. All requests to access data in MiMiR are routed through a sophisticated middleware security layer, thereby allowing secure data access and sharing amongst MiMiR registered users prior to publication. Data in MiMiR can be mined and analysed using the integrated EMAAS open source analysis web portal or via export of data and meta-data into the Rosetta Resolver data analysis package.

Conclusion
The new MiMiR suite of software enables systematic and effective capture of extensive experimental and clinical information with the highest MIAME score, and secure data sharing prior to publication. MiMiR currently contains more than 150 experiments corresponding to over 3000 hybridisations and supports the Microarray Centre's large microarray user community and two international consortia. The flexible and scalable MiMiR hardware and software architecture enables secure warehousing of thousands of datasets, including clinical studies, from microarray and potentially other -omics technologies.
