Finding Regressions in Projects under Version Control Systems

Version Control Systems (VCS) are frequently used to support development of large-scale software projects. A typical VCS repository of a large project can contain various intertwined branches consisting of a large number of commits. If some kind of unwanted behaviour (e.g. a bug in the code) is found in the project, it is desirable to find the commit that introduced it. Such commit is called a regression point. There are two main issues regarding the regression points. First, detecting whether the project after a certain commit is correct can be very expensive as it may include large-scale testing and/or some other forms of verification. It is thus desirable to minimise the number of such queries. Second, there can be several regression points preceding the actual commit; perhaps a bug was introduced in a certain commit, inadvertently fixed several commits later, and then reintroduced in a yet later commit. In order to fix the actual commit it is usually desirable to find the latest regression point. The currently used distributed VCS contain methods for regression identification, see e.g. the git bisect tool. In this paper, we present a new regression identification algorithm that outperforms the current tools by decreasing the number of validity queries. At the same time, our algorithm tends to find the latest regression points which is a feature that is missing in the state-of-the-art algorithms. The paper provides an experimental evaluation of the proposed algorithm and compares it to the state-of-the-art tool git bisect on a real data set.

[1]  Rakesh M. Verma A General Method and a Master Theorem for Divide-and-Conquer Recurrences with Applications , 1994, J. Algorithms.

[2]  Joseph Robert Horgan,et al.  Incremental regression testing , 1993, 1993 Conference on Software Maintenance.

[3]  C. Michael Pilato,et al.  Version control with subversion - the standard in open source version control , 2008 .

[4]  Andreas Zeller,et al.  Yesterday, my program worked. Today, it does not. Why? , 1999, ESEC/FSE-7.

[5]  Thomas Zimmermann,et al.  Automatic Identification of Bug-Introducing Changes , 2006, 21st IEEE/ACM International Conference on Automated Software Engineering (ASE'06).

[6]  Stephan Merz,et al.  Model Checking , 2000 .

[7]  Andreas Zeller,et al.  When do changes induce fixes? , 2005, ACM SIGSOFT Softw. Eng. Notes.

[8]  Ivana Cerná,et al.  Finding Boundary Elements in Ordered Sets with Application to Safety and Requirements Analysis , 2016, SEFM.

[9]  Michael Bächle,et al.  Ruby on Rails , 2006, Softwaretechnik-Trends.