Interactive Visualization for Large-Scale Multi-factorial Research Designs

Recent publications have shown that the majority of studies cannot be adequately reproduced. The underlying causes appear to be diverse. Using inappropriate statistical tools can lead to dubious correlations being reported as significant results. Information missing from lab protocols or other metadata can make verification impossible. With the advent of Big Data in the life sciences and the associated measurement of thousands of multi-omics samples, researchers depend more than ever on adequate metadata annotation. In recent years, the scientific community has created multiple experimental design standards that try to define the minimum information necessary to make experiments reproducible. Tools help to create or analyze this abundance of metadata, but they are often still based on spreadsheet formats and lack intuitive visualizations. We present an interactive graph visualization tailored to experiments that use a factorial experimental design. Our solution summarizes sample sources and extracted samples by the similarity of their independent variables, making it possible to quickly grasp the scientific question at the core of an experiment, even for large studies. We support the ISA-Tab standard, which enables visualization of diverse omics experiments. As part of our platform for data-driven biomedical research, our implementation offers additional features, such as detecting the status of data generation.
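
The idea of summarizing samples by shared independent variables can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `summarize_by_factors`, the sample records, and the factor names `genotype` and `treatment` are all hypothetical, and the sketch assumes that samples with identical factor levels are collapsed into a single summary node carrying a count.

```python
from collections import Counter


def summarize_by_factors(samples, factors):
    """Collapse samples that share the same factor levels into one summary node.

    Hypothetical helper for illustration only; samples are plain dicts and
    a missing factor is treated as the level None.
    """
    counts = Counter(
        tuple(sample.get(factor) for factor in factors) for sample in samples
    )
    # Counter keeps first-seen order, so nodes appear in input order.
    return [
        {"levels": dict(zip(factors, levels)), "count": n}
        for levels, n in counts.items()
    ]


# Made-up 2x2 factorial design with six samples.
samples = [
    {"id": "S1", "genotype": "wildtype", "treatment": "control"},
    {"id": "S2", "genotype": "wildtype", "treatment": "control"},
    {"id": "S3", "genotype": "wildtype", "treatment": "drug"},
    {"id": "S4", "genotype": "mutant", "treatment": "control"},
    {"id": "S5", "genotype": "mutant", "treatment": "drug"},
    {"id": "S6", "genotype": "mutant", "treatment": "drug"},
]

nodes = summarize_by_factors(samples, ["genotype", "treatment"])
for node in nodes:
    print(node["levels"], node["count"])
```

Under this assumption, six samples reduce to four nodes, one per combination of factor levels, which is the kind of compression that keeps a graph readable as the number of samples grows.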
