Near-Real-Time OGC Catalogue Service for Geoscience Big Data

Geoscience data are typically big data, distributed across many agencies and individuals worldwide. Efficient data sharing and interoperability are important for managing and applying geoscience data. The Open Geospatial Consortium (OGC) Catalogue Service for the Web (CSW) is an open interoperability standard that supports the discovery of geospatial data. Conventional OGC catalogue services have been studied before, but few studies have discussed a near-real-time OGC catalogue service for geoscience big data. Discovering these data through an OGC catalogue service in near real time is desirable, but such a service requires the metadata repository to be updated frequently and within a short time, which is extremely challenging when the volume of geoscience data is massive. In this study, we focus on how a near-real-time OGC catalogue service can be realized through several lightweight data structures, algorithms, and tools. We propose a framework for a near-real-time OGC catalogue service and discuss the elements of the framework that need particular attention when dealing with massive amounts of real-time data, followed by a review of several methods that should be considered in a near-real-time OGC CSW service. A case study on providing an OGC catalogue service for Unidata real-time data demonstrates how specific methods are used to handle real-time data. The goal of this paper is to fill the knowledge gap regarding OGC catalogue services for geoscience big data, and the work has practical significance for facilitating a near-real-time OGC catalogue service.
