Managing and using provenance in the semantic web

The Web contains some extremely valuable information; however, poor-quality, inaccurate, irrelevant, or fraudulent information can often be found as well. With the increasing amount of data available, it is becoming more and more difficult to distinguish truth from speculation on the Web. One of the most important criteria, if not the most important, used to evaluate data credibility is the information source, i.e., the data origin. Trust in the information source is a valuable currency users have to evaluate such data. Data popularity, recency (or the time of validity), reliability, or vagueness ascribed to the data may also help users to judge the validity and appropriateness of information sources. We call this knowledge derived from the data the provenance of the data. Provenance is an important aspect of the Web. It is essential in identifying the suitability, veracity, and reliability of information, and in deciding whether information is to be trusted, reused, or even integrated with other information sources. Therefore, models and frameworks for representing, managing, and using provenance in the realm of Semantic Web technologies and applications are critically required. This thesis highlights the benefits of the use of provenance in different Web applications and scenarios. In particular, it presents management frameworks for querying and reasoning in the Semantic Web with provenance, and presents a collection of Semantic Web tools that explore provenance information when ranking and updating caches of Web data. To begin, this thesis discusses a highly flexible and generic approach to the treatment of provenance when querying RDF datasets. The approach re-uses existing RDF modeling possibilities in order to represent provenance. It extends SPARQL query processing in such a way that, given a SPARQL query for data, one may request provenance without modifying the query.
The use of provenance within SPARQL queries helps users to understand how RDF facts are derived, i.e., it describes the data and the operations used to produce the derived facts. Turning to more expressive Semantic Web data models, an optimized algorithm for reasoning and debugging OWL ontologies with provenance is presented. Typical reasoning tasks over an expressive Description Logic (e.g., using tableau methods to perform consistency checking, instance checking, satisfiability checking, and so on) are in the worst case doubly exponential, and in practice are often likewise very expensive. With the algorithm described in this thesis, however, one can efficiently reason in OWL ontologies with provenance, i.e., provenance is efficiently combined and propagated within the reasoning process. Users can use the derived provenance information to judge the reliability of inferences and to find errors in the ontology. Next, this thesis tackles the problem of providing Web users with the right content at the right time. The challenge is to efficiently rank a stream of messages based on user preferences. Provenance is used to represent preferences, i.e., users define their preferences over the messages' popularity, recency, etc. This information is then aggregated to obtain a joint ranking. The aggregation problem is related to the problem of preference aggregation in Social Choice Theory. The traditional problem formulation of preference aggregation assumes a fixed set of preference orders and a fixed set of domain elements (e.g., messages). This work, however, investigates how an aggregated preference order has to be updated when the domain is dynamic, i.e., the aggregation approach ranks messages 'on the fly' as they pass through the system. Consequently, this thesis presents computational approaches for online preference aggregation that handle the dynamic setting more efficiently than standard ones.
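The online setting described above can be illustrated with a minimal sketch. It is not the aggregation method of the thesis: it assumes a simple Borda-style count as a stand-in, and the class name `OnlineBordaRanker`, the dimension names, and the message identifiers are all hypothetical. Each provenance dimension (e.g., popularity or recency) keeps a sorted list of values, so a newly arriving message is inserted into the existing orders rather than triggering a rebuild of every order.

```python
import bisect


class OnlineBordaRanker:
    """Illustrative Borda-style aggregation over a growing set of messages.

    Assumption: a higher value is preferred in every provenance dimension.
    """

    def __init__(self, dimensions):
        self.dimensions = list(dimensions)
        # One sorted list of observed values per provenance dimension.
        self.sorted_values = {d: [] for d in self.dimensions}
        self.messages = {}

    def add(self, msg_id, provenance):
        """Insert a new message without rebuilding the per-dimension orders."""
        self.messages[msg_id] = provenance
        for d in self.dimensions:
            bisect.insort(self.sorted_values[d], provenance[d])

    def score(self, msg_id):
        """Sum, over all dimensions, of how many stored values this message beats."""
        prov = self.messages[msg_id]
        return sum(
            bisect.bisect_left(self.sorted_values[d], prov[d])
            for d in self.dimensions
        )

    def ranking(self):
        """Joint ranking: highest aggregated score first, ties broken by id."""
        return sorted(self.messages, key=lambda m: (-self.score(m), m))


ranker = OnlineBordaRanker(["popularity", "recency"])
ranker.add("m1", {"popularity": 10, "recency": 1})
ranker.add("m2", {"popularity": 5, "recency": 3})
ranker.add("m3", {"popularity": 7, "recency": 5})
print(ranker.ranking())
```

The sketch recomputes scores when a ranking is requested; a more careful online approach would also update the aggregated order incrementally.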
Lastly, this thesis addresses the scenario of caching data from the Linked Open Data (LOD) cloud. Data on the LOD cloud changes frequently, and applications relying on that data (by pre-fetching data from the Web and storing local copies of it in a cache) need to continually update their caches. In order to make the best use of the available resources (e.g., network bandwidth for fetching data, and computation time), it is vital to choose a good strategy for deciding when to fetch data from which data source. One strategy to cope with data changes is to check for provenance. Provenance information delivered by LOD sources can denote when the resource on the Web was last changed. Linked Data applications can benefit from this piece of information, since simply checking it may help users decide which sources need to be updated. For this purpose, this work describes an investigation of the availability and reliability of provenance information in Linked Data sources. Another strategy for capturing data changes is to exploit provenance in a time-dependent function. Such a function should measure the frequency of the changes of LOD sources. This work therefore describes an approach to the analysis of data dynamics, i.e., the analysis of the change behavior of Linked Data sources over time, followed by the investigation of different update scheduling strategies to keep local LOD caches up-to-date. This thesis aims to prove the importance and benefits of the use of provenance in different Web applications and scenarios. The flexibility of the approaches presented, combined with their high scalability, makes this thesis a possible building block for the Semantic Web proof layer cake, i.e., the layer of provenance knowledge.
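The idea of a time-dependent function driving cache updates can be sketched as follows. This is an illustration under stated assumptions, not the scheduling strategy evaluated in the thesis: the change rate is estimated naively as the fraction of past observation intervals in which a change was seen, changes are assumed to follow a Poisson process, and the function names, source names, and the `budget` parameter are hypothetical.

```python
import math


def estimate_change_rate(change_flags):
    """Naive estimate: fraction of observed intervals in which the source changed."""
    if not change_flags:
        return 0.0
    return sum(change_flags) / len(change_flags)


def change_probability(rate, elapsed_intervals):
    """Probability of at least one change since the last fetch (Poisson assumption)."""
    return 1.0 - math.exp(-rate * elapsed_intervals)


def schedule_updates(sources, budget):
    """Pick up to `budget` sources that are most likely to have changed.

    `sources` maps a source name to (change_flags, elapsed_intervals), where
    change_flags is the observed change history (1 = changed, 0 = unchanged)
    and elapsed_intervals is the time since the cached copy was last fetched.
    """
    scored = []
    for name, (flags, elapsed) in sources.items():
        p = change_probability(estimate_change_rate(flags), elapsed)
        scored.append((p, name))
    # Highest change probability first; ties broken by name for determinism.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [name for _, name in scored[:budget]]


history = {
    "source_a": ([1, 1, 1, 1], 2),  # changed in every observed interval
    "source_b": ([0, 0, 0, 1], 2),  # changed rarely
    "source_c": ([0, 0, 0, 0], 5),  # never observed to change
}
print(schedule_updates(history, budget=2))
```

Here `budget` stands in for the limited resources mentioned above, such as the number of fetches that the available bandwidth allows per update round.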
