Using Data Provenance to Measure Information Assurance Attributes

Data Provenance is multi-dimensional metadata that specifies Information Assurance attributes like Confidentiality, Authenticity, Integrity, Non-Repudiation etc. It may also include ownership, processing details and other attributes. Further, each Information Assurance attribute may itself have sub-components like objective and subjective values or application security versus transport security. Traditionally, the Information Assurance attributes have been specified probabilistically as a belief value (or corresponding disbelief value) in that Information Assurance attribute. In this paper we introduce a framework based on Subjective Logic that incorporates uncertainty by representing values as a triple of . This framework also allows us to work with conflicting Information Assurance attribute values that may arise from multiple views of an object. We also introduce a formal semantic model for specifying and reasoning over Information assurance properties in a workflow. Data Provenance information can grow substantially as the amount of information kept for each object increases as well as the complexity of a workflow increases. In such situations, it may be necessary to summarize the Data Provenance information. Further, the summarization may depend on the Information Assurance attributes as well as the type of analysis used for Data Provenance. We show how such summarization can be done and how it can be used to generate trust value in the data. We also discuss how the Information Assurance values can be visualized.