Accelerated discovery through integration of Kepler with data turbine for ecosystem research

There is a need for Accelerated Discovery Cycles (ADCs) that integrate experimental and observational data to capture large-scale dynamic ecosystem complexity, instantly process massive datasets, test contrasting mechanistic models, and drive the next set of experiments. The overarching objective is to enable ADCs by coupling recent advances in computational models and cyber-systems with the unique experimental infrastructure of Biosphere 2 (B2), a large-scale earth system science facility now managed by the University of Arizona. ADCs require both a software development environment for modeling complex systems and middleware for streaming data from the field into the models. Kepler is an open-source tool that lets end users design scientific workflows to manage scientific data and perform complex analyses on it. The Ring Buffered Network Bus (RBNB) Data Turbine is a middleware system that integrates sensor-based environmental observing systems with data processing systems. Currently, the integration between Kepler and Data Turbine is limited to reading from the Data Turbine. In an ADC, multiple hypotheses are tested with different assimilation models. Because these models run in a distributed computing environment, the ability to read from and write to the Data Turbine simultaneously is a necessity. In this paper we show how to integrate Kepler with the RBNB Data Turbine to achieve this capability. We also exploit the open-source nature of the Kepler system to create customized processing models that accelerate and automate experiments in ecosystem research. We describe our implementation approach in further detail to enable future studies on Kepler and Data Turbine integration.
