The BioExtract Server: a web-based bioinformatic workflow platform

The BioExtract Server (bioextract.org) is an open, web-based system designed to aid researchers in the analysis of genomic data by providing a platform for the creation of bioinformatic workflows. Scientific workflows are created within the system by recording tasks performed by the user. These tasks may include querying multiple, distributed data sources, saving query results as searchable data extracts, and executing local and web-accessible analytic tools. The series of recorded tasks can then be saved as a reproducible, sharable workflow available for subsequent execution with the original or modified inputs and parameter settings. Integrated data resources include interfaces to the National Center for Biotechnology Information (NCBI) nucleotide and protein databases, the European Molecular Biology Laboratory (EMBL-Bank) non-redundant nucleotide database, the Universal Protein Resource (UniProt), and the UniProt Reference Clusters (UniRef) database. The system offers access to numerous preinstalled, curated analytic tools and also provides researchers with the option of selecting computational tools from a large list of web services including the European Molecular Biology Open Software Suite (EMBOSS), BioMoby, and the Kyoto Encyclopedia of Genes and Genomes (KEGG). The system further allows users to integrate local command line tools residing on their own computers through a client-side Java applet.

[1]  Les Hatton,et al.  Empirical Test Observations in Client-Server Systems , 2007, Computer.

[2]  Geoffrey C. Fox,et al.  Examining the Challenges of Scientific Workflows , 2007, Computer.

[3]  Carole A. Goble,et al.  BioCatalogue: a universal catalogue of web services for the life sciences , 2010, Nucleic Acids Res..

[4]  Carole Goble,et al.  Curating Scientific Web Services and Workflows , 2008 .

[5]  Carole A. Goble,et al.  Taverna/myGrid: Aligning a Workflow System with the Life Sciences Community , 2007, Workflows for e-Science, Scientific Workflows for Grids.

[6]  Carole A. Goble,et al.  myExperiment: a repository and social network for the sharing of bioinformatics workflows , 2010, Nucleic Acids Res..

[7]  Rodrigo Lopez,et al.  Multiple sequence alignment with the Clustal series of programs , 2003, Nucleic Acids Res..

[8]  J. Mesirov,et al.  GenePattern 2.0 , 2006, Nature Genetics.

[9]  A. Nekrutenko,et al.  Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences , 2010, Genome Biology.

[10]  Pierre Tufféry,et al.  BIOINFORMATICS ORIGINAL PAPER , 2022 .

[11]  Yogesh L. Simmhan,et al.  The Trident Scientific Workflow Workbench , 2008, 2008 IEEE Fourth International Conference on eScience.

[12]  Enno Ohlebusch,et al.  Replacing suffix trees with enhanced suffix arrays , 2004, J. Discrete Algorithms.

[13]  Edward A. Lee,et al.  Scientific workflow management and the Kepler system , 2006, Concurr. Comput. Pract. Exp..

[14]  Volker Brendel,et al.  BioExtract Server—An Integrated Workflow-Enabling System to Access and Analyze Heterogeneous, Distributed Biomolecular Data , 2010, IEEE/ACM Transactions on Computational Biology and Bioinformatics.

[15]  Carole A. Goble,et al.  Taverna: a tool for building and running workflows of services , 2006, Nucleic Acids Res..

[16]  Matthew Shields,et al.  WS-RF Workflow in Triana , 2008, Int. J. High Perform. Comput. Appl..

[17]  Volker Brendel,et al.  Multi-query sequence BLAST output examination with MuSeqBox , 2001, Bioinform..

[18]  Jill P Mesirov,et al.  Accessible Reproducible Research , 2010, Science.