Towards Knowledge Graph Construction using Semantic Data Mining

Over the last few years, constructing knowledge graphs for new domains and linking them to existing ones has gained significant attention, especially in domains such as biodiversity research that have seen a tremendous increase in available data. In this paper, we introduce a new semantic data mining-based approach to support the (semi-)automatic generation of a biodiversity knowledge graph. The proposed approach exploits and links information from several biodiversity-related resources, including the Encyclopedia of Life (EOL), the Global Biodiversity Information Facility (GBIF), and Global Biotic Interactions (GloBI). In particular, we adopt a data mining technique to extract association rules that support the construction of an initial species interactions knowledge graph. We then use the available biodiversity resources to enrich this graph. We believe that the resulting knowledge graph will help biodiversity scientists gain new insights and will improve data interoperability.
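
The association-rule step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the transaction encoding (one set of taxa per record), the function names, the thresholds, and the `coOccursWith` predicate are all assumptions made for illustration. The code mines single-antecedent rules A -> B by support and confidence and turns each surviving rule into a candidate triple.

```python
from collections import Counter
from itertools import permutations


def mine_pairwise_rules(transactions, min_support=0.5, min_confidence=0.8):
    """Return rules (a, b, support, confidence) for single-item rules a -> b.

    support    = share of transactions containing both a and b
    confidence = share of transactions containing a that also contain b
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for transaction in transactions:
        items = set(transaction)
        item_counts.update(items)
        # Ordered pairs, so a -> b and b -> a are scored separately.
        pair_counts.update(permutations(sorted(items), 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        confidence = count / item_counts[a]
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return sorted(rules)


def rules_to_triples(rules, predicate="coOccursWith"):
    """Map each rule a -> b to a (subject, predicate, object) triple.

    The predicate name is a hypothetical placeholder, not a term
    taken from any biodiversity vocabulary.
    """
    return [(a, predicate, b) for a, b, _, _ in rules]


# Hypothetical toy data: each set lists taxa recorded together.
transactions = [
    {"Apis mellifera", "Trifolium repens"},
    {"Apis mellifera", "Trifolium repens", "Bombus terrestris"},
    {"Bombus terrestris", "Trifolium repens"},
    {"Apis mellifera", "Trifolium repens"},
]

rules = mine_pairwise_rules(transactions)
triples = rules_to_triples(rules)
for triple in triples:
    print(triple)
```

On this toy data only the rules from each bee species to Trifolium repens pass both thresholds; the reverse rules fall below the confidence cutoff. In practice the resulting triples would only be a starting point, to be checked and enriched against resources such as EOL, GBIF, and GloBI.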