Towards Neural Schema Alignment for OpenStreetMap and Knowledge Graphs

OpenStreetMap (OSM) is one of the richest openly available sources of volunteered geographic information. Although OSM includes a wide variety of geographic entities, their descriptions are highly heterogeneous and incomplete, and they do not follow any well-defined ontology. Knowledge graphs can potentially provide valuable semantic information to enrich OSM entities. However, interlinking OSM entities with knowledge graphs is inherently difficult because the OSM schema is large, heterogeneous, ambiguous, and flat, and because annotations are sparse. This paper tackles the alignment of OSM tags with the corresponding knowledge graph classes holistically by jointly considering the schema and instance layers. We propose a novel neural architecture that capitalizes upon a shared latent space for tag-to-class alignment, created using entities that are linked between OSM and knowledge graphs. Our experiments, which align OSM datasets for several countries with two of the most prominent openly available knowledge graphs, Wikidata and DBpedia, demonstrate that the proposed approach outperforms state-of-the-art schema alignment baselines by up to 53 percentage points in F1-score. The resulting alignment facilitates new semantic annotations for over 10 million OSM entities worldwide, an increase of more than 400% over the existing semantic annotations in OSM.
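To make the notion of tag-to-class alignment from linked entities concrete, the following is a minimal sketch of a simple co-occurrence baseline. It is not the neural architecture proposed in the paper: it only counts how often an OSM tag appears on entities whose linked knowledge graph entity belongs to a given class. The function name `align_tags_to_classes`, the thresholds `min_support` and `min_confidence`, and the toy data are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def align_tags_to_classes(linked_entities, min_support=2, min_confidence=0.5):
    """Map each OSM tag to its most frequently co-occurring KG class.

    linked_entities: iterable of (osm_tags, kg_classes) pairs, one pair per
    OSM entity that is already linked to a knowledge graph entity.
    Hypothetical helper for illustration only; not the paper's method.
    """
    tag_counts = Counter()
    pair_counts = defaultdict(Counter)

    # Count tag occurrences and tag/class co-occurrences over linked entities.
    for tags, classes in linked_entities:
        for tag in set(tags):
            tag_counts[tag] += 1
            for cls in set(classes):
                pair_counts[tag][cls] += 1

    alignment = {}
    for tag, n in tag_counts.items():
        # Skip rare tags and tags that never co-occur with any class.
        if n < min_support or not pair_counts[tag]:
            continue
        cls, hits = pair_counts[tag].most_common(1)[0]
        confidence = hits / n  # share of tagged entities that have this class
        if confidence >= min_confidence:
            alignment[tag] = (cls, confidence)
    return alignment


# Toy linked entities (assumed data): OSM tags paired with KG class labels.
linked = [
    (["natural=peak", "name=Zugspitze"], ["mountain"]),
    (["natural=peak", "name=Watzmann"], ["mountain"]),
    (["place=city", "name=Berlin"], ["city"]),
    (["place=city", "name=Hamburg"], ["city", "port settlement"]),
]

alignment = align_tags_to_classes(linked)
for tag, (cls, confidence) in sorted(alignment.items()):
    print(f"{tag} -> {cls} (confidence {confidence:.2f})")
```

In this toy setting, entity-specific tags such as names fall below the support threshold, while type-like tags are matched to a class. A counting baseline of this kind relies entirely on the existing links, which the abstract describes as sparse.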
