Decision Provenance: Capturing data flow for accountable systems

Demand is growing for more accountability in the technological systems that increasingly occupy our world. However, the complexity of many of these systems - often systems of systems - poses accountability challenges. This is because the details and nature of the data flows that interconnect and drive systems, which often occur across technical and organisational boundaries, tend to be opaque. This paper argues that data provenance methods show much promise as a technical means for increasing the transparency of these interconnected systems. Given concerns with the ever-increasing levels of automated and algorithmic decision-making, we make the case for decision provenance. This involves exposing the 'decision pipeline' by tracking the chain of inputs to, and flow-on effects from, the decisions and actions taken within these systems. This paper proposes decision provenance as a means to assist in raising levels of accountability, discusses relevant legal conceptions, and indicates some practical considerations for moving forward.
