The GEON portal: accelerating knowledge discovery in the geosciences

Geoscience studies produce data from observations, experiments, and simulations at an enormous rate. With the proliferation of applications and data formats, the geoscience research community faces many challenges in managing and sharing resources effectively and in integrating and analyzing data efficiently. In this paper, we discuss how these challenges are being addressed by the GEON Portal, a Web-based distributed resource management system that provides integrated access to the data and tools needed for knowledge discovery in the geosciences. Unlike previous data management efforts, which were either data-driven or application-driven, the GEON Portal provides facilities for efficient sharing, discovery, and integration of both data and the services that use geoscience data. We identify the challenges involved in managing geoscientific resources and provide solutions that exploit the syntactic, semantic, temporal, and spatial metadata associated with those resources. Drawing on our experience with geoscientific data, we also aim to offer insight into the challenges of building a comprehensive scientific data management solution.
