BioSciCumulus: a portal for provenance data analysis in computational biology workflows

Scientific Workflow Systems (SWS) support the management of scientific experiments. However, analyzing result data remains difficult because of the volume and heterogeneity of the data generated. To assist experiment analysis, SWS capture provenance data that track workflow execution. Nevertheless, this analysis may not be simple, since it requires user expertise in query languages and knowledge of how the provenance data are modeled. To address these issues, this paper proposes the BioSciCumulus Portal, which facilitates both the submission of scientific workflows in the bioinformatics domain to High Performance Computing (HPC) environments and the analysis of the resulting data, without requiring the user to configure the HPC environment or to specify analyses in query language syntax.
