Leaving Behind the Software History When Transitioning to Open Source: Reasons and Implications

Maintenance of software history is regarded as one of the most relevant features of Version Control Systems (VCS) and is widely considered indispensable for software developers. However, transitioning from proprietary to open source software poses a challenge: keeping the software history may expose years of historical records and internal matters of the company that built the software. On the other hand, removing the software history may disrupt development and hinder new contributors. We surveyed open source software projects that made this shift to investigate (1) why they removed the software history and (2) the challenges developers face when the software history is unavailable. Among the results, we found that the most common reason for removing the software history is that it is entangled with proprietary code; the second most common is that the history contains sensitive information. Interestingly, most core developers believed that the lack of software history is, in the worst case, “a very minor inconvenience.”
