BLINKER: A Blockchain-Enabled Framework for Software Provenance

The way software is built and delivered has shifted considerably. Most software systems deployed today are created by distributed, often autonomous, teams working in heterogeneous environments and drawing on many artifacts, such as externally developed libraries, from a variety of disparate sources. Stakeholders across the software delivery value chain, such as developers, managers, and clients, want to know how and why an artifact came to be where it is, what other artifacts are related to it, and who else is using it. Software provenance encompasses the origins, evolution, and usage of artifacts, and is critical for comprehension, management, decision-making, and the analysis of software quality, processes, people, and issues. In this paper, we propose an extensible framework, based on standard provenance model specifications and blockchain technology, for capturing, storing, exploring, and analyzing software provenance data. Our framework (i) enhances the trustworthiness of provenance data, (ii) uncovers non-trivial insights through inference and reasoning, and (iii) enables interactive visualization of provenance insights. We demonstrate the utility of the proposed framework using open source project data.
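
To make the combination of a provenance model and a tamper-evident store concrete, the following is a minimal sketch and not the BLINKER implementation. It assumes each provenance record names an entity, the activity that generated it, and the associated agent (terms borrowed from PROV-DM), and it substitutes a simple in-memory SHA-256 hash chain for a real blockchain. The class `ProvenanceLedger`, its methods, and the record field layout are hypothetical names introduced only for illustration.

```python
import hashlib
import json
from dataclasses import dataclass, field

# Placeholder "previous hash" for the first block (an assumption of this sketch).
GENESIS = "0" * 64


def _digest(payload: dict, prev_hash: str) -> str:
    """Hash a record together with the hash of the block before it."""
    body = json.dumps({"payload": payload, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


@dataclass
class ProvenanceLedger:
    """Hypothetical append-only ledger of PROV-style records (illustrative only)."""

    blocks: list = field(default_factory=list)

    def append(self, entity: str, activity: str, agent: str) -> dict:
        """Record that `entity` was generated by `activity`, associated with `agent`."""
        payload = {
            "entity": entity,
            "wasGeneratedBy": activity,
            "wasAssociatedWith": agent,
        }
        prev_hash = self.blocks[-1]["hash"] if self.blocks else GENESIS
        block = {
            "payload": payload,
            "prev": prev_hash,
            "hash": _digest(payload, prev_hash),
        }
        self.blocks.append(block)
        return block

    def verify(self) -> bool:
        """Recompute every hash; any edited record or broken link fails the check."""
        prev_hash = GENESIS
        for block in self.blocks:
            if block["prev"] != prev_hash:
                return False
            if block["hash"] != _digest(block["payload"], prev_hash):
                return False
            prev_hash = block["hash"]
        return True

    def history(self, entity: str) -> list:
        """Return the recorded provenance of one artifact, oldest first."""
        return [b["payload"] for b in self.blocks if b["payload"]["entity"] == entity]
```

In this sketch, appending two records for the same artifact and then calling `history` returns both in order, while altering any stored field afterwards causes `verify` to return `False`. A real deployment would rely on a distributed ledger with consensus rather than a single in-memory list.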
