CloudDRN: A Lightweight, End-to-End System for Sharing Distributed Research Data in the Cloud

The cloud has proven itself as a scalable platform for Web-based applications. However, scientists and medical researchers are still searching for a simple cloud-based architecture that enables secure collaboration and sharing of distributed datasets. To date, attempts to use the cloud for this purpose have generally treated it as simply a pool of servers on which to run legacy software, an approach that fails to leverage the cloud's unique platform capabilities. In this paper, we describe our Cloud Distributed Research Network (CloudDRN). Compared to legacy distributed systems, CloudDRN leverages the cloud for availability, reliability, scalability, and improved security, while still supporting site autonomy. Our philosophy is to adapt commercial software tooling that was originally designed for business use cases, thereby benefiting from its large built-in user community. We describe our general architecture and show an example system built to share distributed clinical research data. We evaluate our system on Amazon Web Services (AWS) and Microsoft Windows Azure and find that, while the two clouds incur similar financial cost, representative queries are 3.5x slower on average on Windows Azure.
