A Knowledge-Driven Geospatially Enabled Framework for Geological Big Data

Geological surveys accumulate large volumes of structured and unstructured data. Fully exploiting the knowledge contained in this geological big data, and making such large volumes of data more accessible, are important goals. Building on the architecture of the geological survey information cloud-computing platform (GSICCP) and on big-data technologies, we split unstructured geological data into fragments and extract multi-dimensional features using a geological domain ontology. The fragments are reorganized into a NoSQL (Not Only SQL) database, and associations between the fragments are then added. A specific class of geological questions is analyzed and transformed into workflow tasks, according to predefined rules and the associations between fragments, in order to identify spatial information and unstructured content. Based on previous work, we establish a knowledge-driven geologic survey information smart-service platform (GSISSP) and present a case study. The case study shows that all content with known relationships or semantic associations can be mined with the assistance of multiple ontologies, thereby improving the accuracy and comprehensiveness of geological information discovery.
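
The fragment-and-associate idea described above can be illustrated with a minimal Python sketch. It is not the GSISSP implementation: the toy `ONTOLOGY` dictionary, the `Fragment` class, the paragraph-based splitting rule, and the in-memory dictionary standing in for a NoSQL document store are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field

# Hypothetical toy ontology: term -> feature dimension (an assumption,
# standing in for a real geological domain ontology).
ONTOLOGY = {
    "granite": "lithology",
    "basalt": "lithology",
    "fault": "structure",
    "jurassic": "age",
}


@dataclass
class Fragment:
    frag_id: str
    text: str
    features: dict = field(default_factory=dict)  # dimension -> set of terms
    related: set = field(default_factory=set)     # ids of associated fragments


def split_into_fragments(doc_id, text):
    """Split a document into paragraph-level fragments (assumed granularity)."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    return [Fragment(f"{doc_id}#{i}", p) for i, p in enumerate(paragraphs)]


def extract_features(fragment):
    """Tag a fragment with every ontology term that occurs in its text."""
    words = {w.strip(".,;:()").lower() for w in fragment.text.split()}
    for term, dimension in ONTOLOGY.items():
        if term in words:
            fragment.features.setdefault(dimension, set()).add(term)


def link_fragments(fragments):
    """Associate two fragments when they share a term in the same dimension."""
    for i, a in enumerate(fragments):
        for b in fragments[i + 1:]:
            shared = any(
                a.features.get(dim, set()) & terms
                for dim, terms in b.features.items()
            )
            if shared:
                a.related.add(b.frag_id)
                b.related.add(a.frag_id)


def build_store(doc_id, text):
    """Return a dict keyed by fragment id, standing in for a document store."""
    fragments = split_into_fragments(doc_id, text)
    for frag in fragments:
        extract_features(frag)
    link_fragments(fragments)
    return {frag.frag_id: frag for frag in fragments}


report = (
    "The granite intrusion is cut by a fault.\n\n"
    "Jurassic sediments overlie the basalt.\n\n"
    "A second fault offsets the Jurassic sediments."
)
store = build_store("r1", report)
for frag in store.values():
    print(frag.frag_id, sorted(frag.features), sorted(frag.related))
```

In this sketch a question such as "which passages mention faults?" would reduce to a lookup over the `features` field, and the `related` sets let a query follow associations from one fragment to others that share a term.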
