Intuitive Web-Based Experimental Design for High-Throughput Biomedical Data

Big data bioinformatics aims to draw biological conclusions from large and complex biological datasets. The analysis of big data adds value, however, only if the data is accompanied by accurate metadata annotation. Particularly in high-throughput experiments, intelligent approaches are needed to keep track of the experimental design, including the conditions under study as well as information that may be useful for failure analysis or for future experiments. In addition to managing this information, researchers urgently need means for integrated design and interfaces for structured data annotation. Here, we propose a factor-based experimental design approach that enables scientists to easily create large-scale experiments with the help of a web-based system. We present a novel implementation of a web-based interface that allows the collection of arbitrary metadata. To exchange and edit information, we provide a spreadsheet-based, human-readable format. Sample sheets with identifiers and meta-information can subsequently be created for data generation facilities. Data files produced by measuring the samples can be uploaded to a datastore, where they are automatically linked to the previously created experimental design model.
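
To illustrate the general idea of a factor-based design, the following minimal sketch expands a set of experimental factors into one sample per combination of factor levels and writes the result as a tab-separated sample sheet. It is not the system described here: the function names, the identifier scheme (`S0001`, ...), the column names, and the example factors are all assumptions made for illustration, and a full factorial expansion is only one possible way to derive samples from factors.

```python
import csv
import io
from itertools import product


def full_factorial(factors, replicates=1, prefix="S"):
    """Return one sample record per factor-level combination and replicate.

    `factors` maps a factor name to its list of levels. The identifier
    scheme (prefix plus a zero-padded running number) is hypothetical.
    """
    names = list(factors)
    samples = []
    for combo in product(*(factors[name] for name in names)):
        for rep in range(1, replicates + 1):
            sample_id = f"{prefix}{len(samples) + 1:04d}"
            record = {"sample_id": sample_id, **dict(zip(names, combo)), "replicate": rep}
            samples.append(record)
    return samples


def to_sample_sheet(samples):
    """Serialize sample records as a tab-separated, human-readable sheet."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(samples[0]), delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(samples)
    return buffer.getvalue()


# Hypothetical factors: 2 genotypes x 2 treatments x 2 time points, 3 replicates each.
factors = {
    "genotype": ["wildtype", "knockout"],
    "treatment": ["control", "drug"],
    "timepoint_h": [0, 24],
}
samples = full_factorial(factors, replicates=3)
sheet = to_sample_sheet(samples)
```

Because every record carries its factor levels alongside a stable identifier, a data file named after a sample identifier can later be matched back to the condition it was measured under, which is the kind of linkage the abstract describes between uploaded data and the design model.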
