End-to-End eScience: Integrating Workflow, Query, Visualization, and Provenance at an Ocean Observatory

Data analysis tasks at an ocean observatory require integrative and domain-specialized use of database, workflow, and visualization systems. We describe a platform to support these tasks, developed as part of the cyberinfrastructure at the NSF Science and Technology Center for Coastal Margin Observation and Prediction, that integrates a provenance-aware workflow system, 3D visualization, and a remote query engine for large-scale ocean circulation models. We show how these disparate tools complement each other and give examples of real scientific insights delivered by the integrated system. We conclude that data management solutions for eScience require this kind of holistic, integrative approach, explain how our approach may be generalized, and recommend a broader, application-oriented research agenda to explore relevant architectures.
