Impact Analysis of Cross-Project Bugs on Software Ecosystems

Software projects are increasingly forming socio-technical ecosystems within which individual projects rely on the infrastructure or functional components provided by other projects, leading to complex inter-dependencies. Through inter-project dependencies, a bug in an upstream project may have a profound impact on a large number of downstream projects, resulting in cross-project bugs. This emerging type of bug brings new challenges in bug fixing because its influence on downstream projects is unclear. In this paper, we present an approach to estimating the impact of a cross-project bug within its ecosystem by identifying the affected downstream modules (classes/methods). Note that a downstream project that uses a buggy upstream function is not necessarily affected, because its usage may not satisfy the failure-inducing preconditions. For a reported bug with a known root-cause function and known failure-inducing preconditions, we first collect the candidate downstream modules that call the upstream function through an ecosystem-wide dependence analysis. Then, the paths to the call sites of the buggy upstream function are encoded as symbolic constraints. Solving these constraints together with the failure-inducing preconditions identifies the affected downstream modules. Our evaluation of 31 existing upstream bugs on the scientific Python ecosystem, containing 121 versions of 22 popular projects (16 million LOC in total), shows that the approach is highly effective: from the 25,490 candidate downstream modules that invoke the buggy upstream functions, it identifies 1,132 modules where the upstream bugs can be triggered, pruning 95.6% of the candidates. The technique has no false negatives and an average false positive rate of 7.9%. Only 49 of the 1,132 identified downstream modules had previously been reported as affected.
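The two-stage idea described above (collect candidate call sites, then check whether any path to a call site is compatible with the failure-inducing precondition) can be illustrated with a minimal sketch. It is not the paper's implementation and rests on loud assumptions: `upstream_lib.scale`, the downstream functions, and the precondition `x < 0` are all hypothetical; only `if` guards are treated as path conditions; the argument passed upstream is assumed to be the caller's parameter `x`; and a brute-force search over a small integer range stands in for a real SMT solver such as Z3.

```python
import ast

# Hypothetical downstream code; upstream_lib.scale is an invented buggy function.
DOWNSTREAM_SRC = '''
import upstream_lib

def safe_caller(x):
    if x > 0:
        return upstream_lib.scale(x)
    return 0

def risky_caller(x):
    if x < 10:
        return upstream_lib.scale(x)
    return 0

def unrelated(x):
    return x + 1
'''

def _calls_target(node, module, func):
    """True if the subtree contains a call of the form module.func(...)."""
    for n in ast.walk(node):
        if (isinstance(n, ast.Call) and isinstance(n.func, ast.Attribute)
                and n.func.attr == func
                and isinstance(n.func.value, ast.Name)
                and n.func.value.id == module):
            return True
    return False

def _collect(stmts, guards, module, func, out):
    """Record the list of enclosing if-guards for every statement that calls the target."""
    for stmt in stmts:
        if isinstance(stmt, ast.If):
            cond = ast.unparse(stmt.test)  # requires Python 3.9+
            _collect(stmt.body, guards + [cond], module, func, out)
            _collect(stmt.orelse, guards + [f"not ({cond})"], module, func, out)
        elif _calls_target(stmt, module, func):
            out.append(guards)

def candidate_paths(source, module, func):
    """Stage 1: map each top-level function that calls module.func to its guarded paths."""
    result = {}
    for node in ast.parse(source).body:
        if isinstance(node, ast.FunctionDef):
            paths = []
            _collect(node.body, [], module, func, paths)
            if paths:
                result[node.name] = paths
    return result

def is_affected(paths, precondition, domain=range(-50, 51)):
    """Stage 2: brute-force stand-in for an SMT solver over a small integer domain."""
    for guards in paths:
        expr = " and ".join(f"({g})" for g in guards + [precondition])
        code = compile(expr, "<constraint>", "eval")
        if any(eval(code, {"__builtins__": {}}, {"x": x}) for x in domain):
            return True
    return False

# Assumed failure-inducing precondition of the hypothetical bug: negative input.
PRECONDITION = "x < 0"

candidates = candidate_paths(DOWNSTREAM_SRC, "upstream_lib", "scale")
affected = sorted(name for name, paths in candidates.items()
                  if is_affected(paths, PRECONDITION))
print("candidates:", sorted(candidates))
print("affected:", affected)
```

In this toy setting both callers are candidates, but only `risky_caller` can reach the call with a negative argument, so `safe_caller` is pruned. A real analysis would additionally have to handle inter-procedural paths, early returns, and argument expressions, which this sketch deliberately omits.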
