Paper information - BOD: a customizable bioinformatics on demand system accommodating multiple steps and parallel tasks.


Integrating bioinformatics resources worldwide is a major concern of the biological community. We established the BOD (Bioinformatics on demand) system, which uses Grid computing technology to set up a virtual workbench on a web-based platform and to assist researchers in performing customized, comprehensive bioinformatics work. Through the BOD end-user web interface, users can submit entire search queries and computation requests from their own office, e.g. from DNA assembly to gene prediction and finally protein folding. The BOD web portal parses a user's job request into steps, each of which may contain multiple tasks that run in parallel. The BOD task scheduler takes an entire task, or splits it into multiple subtasks, and dispatches the task or subtasks proportionally to the computation node(s) associated with the BOD portal server. A node may further split an assigned task and distribute it to its sub-nodes using a similar strategy. Finally, the BOD portal server receives and collates all results and returns them to the user. BOD uses a pipeline model to describe the user's submitted data and stores job requests, status, and results in a relational database. In addition, an XML criterion is established to capture the details of each task's computation program.
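The proportional dispatch described in the abstract can be illustrated with a minimal sketch. It is not the BOD implementation: the `Node` class, its `capacity` weight, and the largest-remainder rounding rule are all assumptions made for illustration, since the abstract does not say how proportions are computed. The sketch assumes every node has a positive capacity.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A computation node. `capacity` is a hypothetical relative weight."""
    name: str
    capacity: int
    children: list["Node"] = field(default_factory=list)


def split_proportionally(n_items: int, weights: list[int]) -> list[int]:
    """Split n_items into integer shares proportional to weights.

    Uses largest-remainder rounding (an assumed rule) so shares sum to n_items.
    """
    total = sum(weights)
    shares = [n_items * w // total for w in weights]
    remainders = [(n_items * w) % total for w in weights]
    leftover = n_items - sum(shares)
    # Give the leftover items to the nodes with the largest remainders.
    order = sorted(range(len(weights)), key=lambda i: remainders[i], reverse=True)
    for i in order[:leftover]:
        shares[i] += 1
    return shares


def dispatch(items: list[str], nodes: list[Node]) -> dict[str, list[str]]:
    """Assign items to leaf nodes; a node with sub-nodes re-splits its share."""
    assignment: dict[str, list[str]] = {}
    shares = split_proportionally(len(items), [n.capacity for n in nodes])
    start = 0
    for node, share in zip(nodes, shares):
        chunk = items[start:start + share]
        start += share
        if node.children:
            # Same strategy, applied one level down.
            assignment.update(dispatch(chunk, node.children))
        else:
            assignment[node.name] = chunk
    return assignment


if __name__ == "__main__":
    cluster = [
        Node("a", capacity=1),
        Node("b", capacity=3, children=[Node("b1", 1), Node("b2", 1)]),
    ]
    subtasks = [f"seq{i}" for i in range(8)]
    for name, chunk in dispatch(subtasks, cluster).items():
        print(name, chunk)
```

In this sketch a task is a list of independent work items (for example, sequences to search), so splitting it into subtasks amounts to slicing the list. A real scheduler would also need to track status and collect results, which the abstract assigns to the portal server and its relational database.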
