A Language for Spatial Data Manipulation

Environmental Observation and Forecasting Systems (EOFS) create new opportunities and challenges for the generation and use of environmental data products. The number and diversity of these data products, however, have been artificially constrained by the lack of a simple descriptive language for expressing them. Data products that can be described simply in English take pages of obscure scripts to generate. The scripts obfuscate the original intent of the data product, making it difficult for users and scientists to understand the overall product catalog. The problem is exacerbated by the evolution of modern EOFS into data product “factories” subject to reliability requirements and daily production schedules. New products must be developed and assimilated into the product suite as quickly as they are imagined. Reliability must be maintained despite changing hardware, software, file formats, and staff. We present a language for naturally expressing data product recipes over structured and unstructured computational grids of arbitrary dimension. Informed by relational database theory, we have defined a simple data model and a handful of operators that can be composed to express complex visualizations, plots, and transformations of gridded datasets. The language provides a medium for the design, discussion, and development of new data products without commitment to particular data structures or algorithms. In this paper, we provide a formal description of the language and several examples of its use to express and analyze data products. The context of our research is the CORIE system, an EOFS supporting the study of the Columbia River Estuary.
