Entity Name System: The Back-Bone of an Open and Scalable Web of Data

Recognizing that information from different sources refers to the same (real-world) entity is a crucial challenge in instance-level information integration, as it is a prerequisite for combining the information about one entity from different sources. The required entity matching is time-consuming and thus severely limits large-scale, dynamic information integration. Increased re-use of entity identifiers (or names) across different information collections, such as RDF repositories, databases and document collections, eases this situation. In the ideal case, entity matching is reduced to the trivial problem of spotting the same entity identifier in different information collections. In this paper we propose the use of an entity name system (ENS), as currently under development in the EU-funded project OKKAM, for systematically supporting the re-use of entity identifiers. The main purpose of the ENS is to provide unique and uniform names for entities for use in information collections, so that the same name is used for an entity even when it is referenced in different contexts. Of course, the creation of an ENS that can efficiently deal with entities at Web scale raises scalability issues of its own. This paper focuses on the role of an ENS in contributing to the scalability of ad-hoc and on-demand information integration tasks.
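
The ideal case described above, in which entity matching reduces to spotting the same identifier in different collections, can be illustrated with a minimal Python sketch. It assumes each information collection is available as a mapping from entity identifier to attributes. The identifier scheme (`ens:entity-001`), the function names and the sample records are hypothetical and chosen only for illustration; they are not the OKKAM identifier format or API.

```python
from collections import defaultdict


def shared_entities(*collections):
    """Return the identifiers that occur in every given collection."""
    key_sets = [set(collection) for collection in collections]
    return set.intersection(*key_sets) if key_sets else set()


def integrate(*collections):
    """Combine the attributes recorded for each identifier across collections."""
    merged = defaultdict(dict)
    for collection in collections:
        for entity_id, attributes in collection.items():
            # Identical identifiers denote the same entity, so no similarity
            # computation is needed: attributes are simply combined.
            merged[entity_id].update(attributes)
    return dict(merged)


# Hypothetical collections that re-use the same (made-up) entity identifiers.
rdf_repository = {
    "ens:entity-001": {"label": "Example City"},
    "ens:entity-002": {"label": "Example Person"},
}
database = {
    "ens:entity-001": {"country": "Example Country"},
    "ens:entity-003": {"label": "Example Organisation"},
}

matches = shared_entities(rdf_repository, database)
combined = integrate(rdf_repository, database)
```

Under these assumptions, matching is a set intersection over identifiers and integration is a single pass over each collection, in contrast to the pairwise comparison of entity descriptions that is needed when identifiers are not shared.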