Paper Information - Comparing Index Structures for Completeness Reasoning

Data quality is a major issue in the development of knowledge graphs. Data completeness is a key factor in data quality, concerning how broad and deep the information contained in knowledge graphs is. Large-scale knowledge graphs (e.g., DBpedia, Wikidata) contain such a vast amount of information that they may well be complete for a wide range of topics, such as children of Joko Widodo, cantons of Switzerland, and presidents of Indonesia. Previous research has shown how one can augment knowledge graphs with statements about their completeness, stating which parts of the data are complete. Such meta-information can be leveraged to check query completeness, that is, whether the answer returned by a query is complete. Yet it is still unclear how such a check can be done in practice, especially when many completeness statements are involved. We devise implementation techniques that make completeness reasoning feasible in the presence of large sets of completeness statements, and experimentally evaluate their effectiveness in realistic settings based on the characteristics of real-world knowledge graphs.