Semantics-Enhanced Geoscience Interoperability, Analytics, and Applications

We present our research ideas for developing cyberinfrastructure for geoscience applications in the context of the EarthCube initiative, together with our NSF-sponsored work on incorporating spatial-temporal-thematic semantics for enhanced querying of, and feature extraction from, sensor data streams.

(1) Semantics-empowered Cyberinfrastructure for Geoscience Applications

Rapidly maturing semantic technologies, based in part on Semantic Web standards, have the potential to increase opportunities for interdisciplinary research by providing support and incentives for sharing, publishing, accessing, and discovering heterogeneous data. Our thesis is that associating machine-processable, lightweight semantics with the long tail of science data can overcome the data discovery, integration, and interoperability challenges caused by data heterogeneity. To demonstrate this, we propose to develop cyberinfrastructure (CI) that uses lightweight semantic capabilities to serve individual researchers. Specifically, the focus is on ease of use, low upfront cost, and shallow semantics that appeal to, and are most likely to be used by, the broad community of geoscientists. Choosing controlled vocabularies and lightweight ontologies over formal ontologies in OWL reduces complexity and training effort, enabling wider and faster adoption by scientists who are not trained in computer science techniques. We propose to use existing, community-ratified and enhanced ontologies that scientists can employ with minimal training to annotate (tag) their data, publish it, and discover relevant data in support of scientific discoveries. Coarse-grained annotations can facilitate semantic search, while fine-grained annotations and extraction can be used to create Linked Open Datasets (LOD).
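
As a rough illustration of the coarse-grained tagging described above, the following minimal sketch maps tags from a small controlled vocabulary to concept URIs and emits N-Triples lines that could be published as linked data. The vocabulary entries, the example.org URIs, and the function names `annotate` and `find_datasets` are assumptions made for this example and are not part of the proposed cyberinfrastructure; the only real identifier is the Dublin Core `subject` property.

```python
# Hypothetical controlled vocabulary: tag label -> concept URI.
# The example.org URIs are placeholders, not a real geoscience vocabulary.
VOCAB = {
    "precipitation": "http://example.org/vocab/Precipitation",
    "groundwater": "http://example.org/vocab/Groundwater",
}

# Dublin Core Terms property used here to link a dataset to a concept.
DCTERMS_SUBJECT = "http://purl.org/dc/terms/subject"


def annotate(dataset_uri, tags):
    """Return N-Triples lines linking a dataset to vocabulary concepts.

    Tags that are not in the controlled vocabulary are rejected, which is
    what keeps the annotations consistent across researchers.
    """
    triples = []
    for tag in tags:
        key = tag.strip().lower()
        if key not in VOCAB:
            raise ValueError(f"unknown vocabulary term: {tag!r}")
        triples.append(f"<{dataset_uri}> <{DCTERMS_SUBJECT}> <{VOCAB[key]}> .")
    return triples


def find_datasets(triples, concept_label):
    """Coarse semantic search: dataset URIs tagged with the given concept."""
    concept = VOCAB[concept_label.strip().lower()]
    suffix = f"<{DCTERMS_SUBJECT}> <{concept}> ."
    return [t.split(" ", 1)[0].strip("<>") for t in triples if t.endswith(suffix)]
```

A researcher could call `annotate("http://example.org/data/well-42", ["Groundwater"])` and later retrieve the dataset with `find_datasets(triples, "groundwater")`. A real deployment would presumably use an RDF library and a community-ratified vocabulary rather than string formatting and a hard-coded dictionary.
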
With LOD, which is increasingly being adopted by open government and open science initiatives, data can be translated into a form that makes it readily available, reusable, and amenable to automatic processing, while supporting conceptual richness of data representation. Our research is aligned with the National Science Foundation’s EarthCube initiative.

(2) Expressive Search and Integration Using Geospatial Information

We have developed expressive extensions to RDF and SPARQL that associate spatiotemporal information with triples via annotations and employ rich operators to support inferencing. Extended with geospatial knowledge to support spatial semantics, this framework can support interoperability and complex analysis. In the context of the Semantic Sensor Web, to process multimodal sensor data streams, we have used spatiotemporal context in the Semantic Sensor Observation Service (SemSOS) to aggregate and combine primitive weather sensor data into weather features. We have also exploited the Geonames portion of LOD to map place names to GPS coordinates, to locate relevant sensors, and to provide easy-to-use, natural query interfaces.
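
The ideas in this section can be sketched in a few lines, under stated assumptions: a triple carries a location and a validity interval as annotations, a place name is resolved to a bounding box, and primitive observations that fall inside the box and time window are combined into a weather feature. This is not the RDF/SPARQL extension or the SemSOS implementation itself. The class `AnnotatedTriple`, the `GAZETTEER` dictionary (a stand-in for a Geonames lookup), the predicate names, the bounding-box coordinates, and the thresholds in `derive_feature` are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class AnnotatedTriple:
    """A triple annotated with a point location and a validity interval."""
    subject: str
    predicate: str
    obj: float
    lat: float
    lon: float
    start: datetime
    end: datetime


# Stand-in for a Geonames lookup: place name -> illustrative bounding box
# given as (min_lat, min_lon, max_lat, max_lon).
GAZETTEER = {"dayton": (39.7, -84.3, 39.9, -84.0)}


def within(triple, bbox, window):
    """Spatiotemporal operator: inside the box and overlapping the window."""
    min_lat, min_lon, max_lat, max_lon = bbox
    in_space = min_lat <= triple.lat <= max_lat and min_lon <= triple.lon <= max_lon
    overlaps = triple.start <= window[1] and window[0] <= triple.end
    return in_space and overlaps


def derive_feature(triples, place, window):
    """Combine primitive observations near a place into a weather feature."""
    bbox = GAZETTEER[place.strip().lower()]
    peak = {}  # highest observed value per predicate within the context
    for t in triples:
        if within(t, bbox, window):
            peak[t.predicate] = max(peak.get(t.predicate, t.obj), t.obj)
    # Hypothetical thresholds, chosen only to make the rule concrete.
    windy = peak.get("windSpeedMph", 0.0) >= 35.0
    snowing = peak.get("snowfallInPerHr", 0.0) > 0.0
    if windy and snowing:
        return "Blizzard"
    if windy:
        return "HighWind"
    return None
```

Calling `derive_feature(observations, "Dayton", (start, end))` returns a feature label only when the qualifying observations share the same spatial and temporal context, which is the role the annotations play in the framework described above.
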