Indicators for merge conflicts in the wild: survey and empirical study

While creating new branches and forks is easy and fast with modern version-control systems, merging is often time-consuming. Especially when dealing with many branches or forks, a prediction of merge costs based on lightweight indicators would help developers recognize problematic merge scenarios before potential conflicts become too severe as a complex software project evolves. We analyze the predictive power of several indicators, such as the number, size, or scattering degree of commits in each branch, derived either from the version-control system or directly from the source code. Based on a survey of 41 developers, we inferred 7 potential indicators for predicting the number of merge conflicts. We tested the corresponding hypotheses by studying 163 open-source projects, including 21,488 merge scenarios and comprising 49,449,773 lines of code. A notable (negative) result is that none of the 7 indicators suggested by the survey participants has predictive power for the frequency of merge conflicts. We discuss this and other findings, as well as the perspectives they open up.
