On the development of an open and collaborative bioinformatics research environment

Abstract: This paper reports on the development of a self-sustaining, community-responsive platform that streamlines the wealth of available open bioinformatics resources to accelerate multidisciplinary collaboration and boost innovation in post-genomics biomedical research. Our approach adopts the principles of reproducible, reusable and remixable computer-aided research, and builds on state-of-the-art concepts and converging technologies for the simple, fast and scalable specification and execution of scientific workflows. The proposed platform enables innovative networking and community building among researchers, facilitates knowledge sharing and co-creation, supports better-informed collaboration, and speeds up the gaining of insights. Paying particular attention to data and research provenance and attribution, the platform integrates a set of innovative services for managing research resources and competences. The overall approach ensures the interoperability of these resources and services from a technical, conceptual and user-interface point of view.

[1]  Eduard Ayguadé,et al.  Workflows for Science: a Challenge when Facing the Convergence of HPC and Big Data , 2017, Supercomput. Front. Innov..

[2]  Carole A. Goble,et al.  The Taverna workflow suite: designing and executing workflows of Web Services on the desktop, web or in the cloud , 2013, Nucleic Acids Res..

[3]  John Chilton,et al.  The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2016 update , 2016, Nucleic Acids Res..

[4]  Peter Webster,et al.  Research Data Repositories: Review of Current Features, Gap Analysis, and Recommendations for Minimum Requirements , 2016 .

[5]  Ryan M. Layer,et al.  SpeedSeq: Ultra-fast personal genome analysis and interpretation , 2014, Nature Methods.

[6]  Ola Spjuth,et al.  Experiences with workflows for automating data-intensive bioinformatics , 2015, Biology Direct.

[7]  V. Marx Biology: The big challenges of big data , 2013, Nature.

[8]  M. Swertz,et al.  Molgenis-impute: imputation pipeline in a box , 2015, BMC Research Notes.

[9]  H. Hakonarson,et al.  ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data , 2010, Nucleic acids research.

[10]  Yong Lu,et al.  A crowdsourcing approach for reusing and meta-analyzing gene expression data , 2016, Nature Biotechnology.

[11]  Torsten Reimer,et al.  Virtual Research Environment Collaborative Landscape Study , 2010 .

[12]  Sergio Contrino,et al.  InterMine: extensive web services for modern biology , 2014, Nucleic Acids Res..

[13]  G. Poste Bring on the biomarkers , 2011, Nature.

[14]  David R. Kelley,et al.  Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks , 2012, Nature Protocols.

[15]  N. B. Anuar,et al.  The rise of "big data" on cloud computing: Review and open research issues , 2015, Inf. Syst..

[16]  L. Levin,et al.  Lost in translation: bumps in the road between bench and bedside. , 2010, JAMA.

[17]  Manolis Tzagarakis,et al.  Collaborative Mining and Interpretation of Large-Scale Data for Biomedical Research Insights , 2014, PloS one.

[18]  John N. Weinstein,et al.  PRADA: pipeline for RNA sequencing data analysis , 2014, Bioinform..

[19]  Rob Knight,et al.  Using QIIME to Analyze 16S rRNA Gene Sequences from Microbial Communities , 2011, Current protocols in bioinformatics.

[20]  Michael C. Frank,et al.  Estimating the reproducibility of psychological science , 2015, Science.

[21]  J. Ioannidis Why Most Published Research Findings Are False , 2019, CHANCE.

[22]  Alan Sill,et al.  The Design and Architecture of Microservices , 2016, IEEE Cloud Computing.

[23]  Melissa S. Anderson,et al.  Normative Dissonance in Science: Results from a National Survey of U.S. Scientists , 2007, Journal of empirical research on human research ethics : JERHRE.

[24]  Nikos Karacapilidis,et al.  Towards a Sustainable Solution for Collaborative Healthcare Research , 2016 .

[25]  Carole A. Goble,et al.  Structuring research methods and data with the research object model: genomics workflows as a case study , 2013, Journal of Biomedical Semantics.

[26]  Moustafa Ghanem,et al.  Tavaxy: Integrating Taverna and Galaxy workflows with cloud computing support , 2012, BMC Bioinformatics.