Everything you always wanted to know about blank nodes

In this paper we thoroughly cover the issue of blank nodes, which RDF defines as 'existential variables'. We first introduce the theoretical precedent for existential blank nodes from first-order logic and from incomplete information in database theory. We then cover the different (and sometimes incompatible) treatment of blank nodes across the W3C stack of RDF-related standards. We present an empirical survey of the blank nodes present in a large sample of RDF data published on the Web (the BTC-2012 dataset), where we find that 25.7% of unique RDF terms are blank nodes, that 44.9% of documents and 66.2% of domains use at least one blank node, and that, aside from one Linked Data domain whose RDF data contains many "blank node cycles", the vast majority of blank nodes form tree structures over which simple entailment is efficient to compute. With respect to the RDF-merge of the full data, we show that 6.1% of blank nodes are redundant under simple entailment. The vast majority of non-lean cases are isomorphisms that result either from multiple blank nodes being given no discriminating information within an RDF document, or from documents being duplicated in multiple Web locations. Although simple entailment is NP-complete and leanness-checking is coNP-complete, in computing this latter result we demonstrate that, in practice, real-world RDF graphs are sufficiently "rich" in ground information for problematic cases to be avoided by non-naive algorithms.
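To make the notion of a non-lean graph concrete, the following is a minimal sketch of a naive leanness check. It is an illustration only, not the algorithm used in the paper: the triple encoding (plain strings, with blank nodes prefixed by `_:`), the function names `is_blank` and `is_lean`, and the `ex:` terms are all assumptions made for this example. A graph is treated as non-lean if some mapping of its blank nodes sends it onto a proper subgraph of itself; the search over mappings is exponential in the number of blank nodes, which is exactly the naive behaviour the abstract says should be avoided.

```python
from itertools import product


def is_blank(term):
    # Assumption for this sketch: blank nodes are strings starting with "_:".
    return term.startswith("_:")


def is_lean(graph):
    """Naive check: a graph is lean if no blank-node mapping sends it
    onto a proper subgraph of itself. Exponential in the blank-node count."""
    graph = set(graph)
    # Candidate targets: every term seen in subject or object position.
    terms = sorted({t for s, _, o in graph for t in (s, o)})
    blanks = [t for t in terms if is_blank(t)]
    for image in product(terms, repeat=len(blanks)):
        mu = dict(zip(blanks, image))
        mapped = {(mu.get(s, s), p, mu.get(o, o)) for s, p, o in graph}
        if mapped < graph:  # strict subset: some triple was redundant
            return False
    return True


# Two blank nodes with no discriminating information: one is redundant.
duplicated = {("ex:a", "ex:p", "_:b1"), ("ex:a", "ex:p", "_:b2")}

# A blank node whose triples are not covered by any other term.
distinct = {("ex:a", "ex:p", "_:b1"), ("_:b1", "ex:q", "ex:c")}

print(is_lean(duplicated))
print(is_lean(distinct))
```

The first graph mirrors the dominant non-lean case described above: `_:b2` can be mapped to `_:b1`, collapsing the graph onto a single triple. In the second graph, no mapping of `_:b1` yields a proper subgraph, so the graph is lean.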
