Tuning Heterogeneous Computing Platforms for Large-Scale Hydrology Data Management

HydroTerre is a research prototype platform developed at Penn State for the hydrology community. It provides access to aggregated scientific data sets that are useful for hydrological modeling and research. HydroTerre's frontend is a web service, and a user query can request creation of a data bundle whose size ranges from a few megabytes to hundreds of gigabytes. In this article, we present software tuning and optimization strategies for various hardware configurations of the HydroTerre platform. Our goal is to minimize access time for a wide range of user queries that create data bundles. We use automated schemes to estimate the computational work required for various queries and to identify the best-performing hardware/software configuration. We hope this study is instructive for researchers developing similar data management cyberinfrastructure in other science and engineering fields.
