Building gateways for life-science applications using the dynamic application runtime environment (DARE) framework

This work is predicated on three trends: (i) the importance, impact and share of TeraGrid/XD resources assigned to the life sciences are increasing, probably faster than for other disciplines; (ii) gateways have proven to be a very effective mechanism for accessing the distributed HPC resources provided by TeraGrid/XD, and in particular a very successful model for shared/community access; and (iii) in spite of the previous two points, capabilities and abstractions are still missing that would enable use of the collective capacity of distributed cyberinfrastructure such as TeraGrid/XD, especially ones that can be used to develop gateways in an easy, extensible and scalable fashion for both compute- and data-intensive applications. We introduce the SAGA-based Dynamic Application Runtime Environment (DARE) framework, from which extensible, versatile and effective gateways that seamlessly utilize scalable infrastructure can be built for life-science applications. We discuss the architecture of DARE-based gateways and four specific life-science gateways, DARE-RFOLD, DARE-DOCK, DARE-HTHP and DARE-NGS, that use the DARE framework to support a wide range of life-science capabilities.
